ABSTRACT

We present a comprehensive set of mock 2dF and SDSS galaxy redshift surveys constructed
from a set of large, high-resolution cosmological N-body simulations. The
radial selection functions and geometrical limits of the catalogues mimic those of 
the genuine surveys. The catalogues span a wide range of cosmologies, including 
both open and flat universes. In all the models the galaxy distributions are biased 
so as to approximately reproduce the observed galaxy correlation function on scales 
of 1-10 $h^{-1}$ Mpc. In some cases models with a variety of different biasing prescriptions
are included. All the mock catalogues are publicly available at
http://star-www.dur.ac.uk/~cole/mocks/main.html. We expect these catalogues to be a valuable
aid in the development of the new algorithms and statistics that will be used to anal- 
yse the 2dF and SDSS surveys when they are completed in the next few years. Mock 
catalogues of the PSCZ survey of IRAS galaxies are also available at the same WWW 
location. 

Key words: cosmology: theory - large-scale structure of Universe - galaxies: clus- 
tering 



1 INTRODUCTION 

Our knowledge of large scale structure in the Universe is going
to change dramatically as a result of the new generation
of galaxy redshift surveys now underway. The Anglo-
Australian 2-degree Field (2dF) galaxy redshift survey will 
measure redshifts for 250,000 galaxies selected from the 
APM galaxy survey (Maddox et al. 1990), and the Sloan 
Digital Sky Survey (SDSS) will obtain a redshift sample of 
one million galaxies. These surveys will be more than one or- 
der of magnitude larger than any existing survey and will al- 
low measurements of large-scale structure of unprecedented 
accuracy and detail. Precise estimates of the standard statis- 
tics that are used to quantify large-scale structure (e.g., the 
galaxy correlation function $\xi(r)$ and power spectrum $P(k)$)
will be possible, and the surveys will provide the first op- 
portunity to examine more subtle properties of the galaxy 
distribution. Achieving these goals will require the develop- 
ment of faster algorithms capable of dealing with the very 
large numbers of galaxies involved, and the development of 
new statistical measures. To facilitate both of these tasks 
before the surveys are complete will require synthetic data 
sets on which the techniques can be developed and tested. 
This paper presents an extensive set of mock 2dF and
SDSS galaxy catalogues. These artificial galaxy redshift catalogues
have been constructed from a series of large, high-resolution
cosmological N-body simulations. The N-body
simulations span a wide range of cosmological models, with
varying values of the density parameter, $\Omega_0$, and the cosmological
constant, $\Lambda_0$, and with varying choices of the shape
and amplitude of the mass fluctuation power spectrum, 
P(k). For some models several different catalogues have been 
produced, each employing a different biasing algorithm to re- 
late the galaxy distribution to the underlying mass distribu- 
tion. All the mock galaxy catalogues have selection functions 
that mimic those expected for the real surveys. The details of 
the construction of catalogues and their basic properties are 
described here. The catalogues themselves can be obtained 
from http://star-www.dur.ac.uk/~cole/mocks/main.html . 

The mock redshift catalogues are the principal scientific 
product of this paper. We expect to use them ourselves as 
we prepare for the analysis of large-scale structure in the 
2dF and SDSS redshift surveys. We are making them publicly
available in the hope that they will be useful to other
researchers, both inside and outside the two collaborations. 
Our illustrative plots also provide a qualitative prediction 
of the structure expected in these redshift surveys if the
leading scenario for structure formation, based on Gaussian
primordial fluctuations and a universe dominated by cold 
dark matter, is basically correct. The mock catalogues have 
a number of limitations (discussed in Section 6 below); for example,
the $345.6\,h^{-1}$ Mpc simulation cubes are not as large
as one might like, and we do not model some of the detailed 
selection biases that will affect the real surveys, such as loss 
of members of close galaxy pairs because of a minimum fibre 
separation. The strength of this collection of catalogues is 
that it covers a wide range of theoretically interesting cos- 
mological models in a systematic, homogeneous, and doc- 
umented fashion. We anticipate that these catalogues will 
be especially helpful to researchers who want to test the 
discriminatory power of statistical techniques that probe intermediate
scale clustering ($\sim 1$-$100\,h^{-1}$ Mpc) and/or to
develop practical implementations of these techniques for 
large data sets. Eventually, mock catalogues like these, or 
improved versions of them, will be a valuable tool for com- 
paring the survey data against the predictions of cosmolog- 
ical theories. 

The cosmological models we have selected fall into two 
sets, which we refer to as "COBE normalized" and "struc- 
ture normalized" (or "cluster normalized"). In the COBE 
normalized models, the amplitude of the density fluctua- 
tions is set by the amplitude of the cosmic microwave back- 
ground temperature fluctuations measured by the COBE 
satellite and extrapolated to smaller scales using standard 
assumptions. The shape of the spectrum of density fluctua- 
tions is fixed by applying additional constraints on the age 
of the universe and the baryon fraction. The structure nor- 
malized models are, on the other hand, intended to produce 
approximately the observed abundance of rich galaxy clus- 
ters, and all of them have the same shape for the density 
fluctuation spectrum, chosen to be consistent with exist- 
ing observations of large-scale structure. Each set contains 
both open ($\Lambda_0 = 0$) and flat ($\Omega_0 + \Lambda_0 = 1$) models with
a range of values of $\Omega_0$. Some of the models we consider
come close to satisfying simultaneously the COBE and cluster
abundance constraints. For each simulation we apply a
"biasing" algorithm to select galaxies from the N-body particle
distribution, choosing its parameters so that the simulated
galaxy population approximately reproduces the amplitude
and slope of the observed galaxy correlation function
on scales $\sim 1$-$10\,h^{-1}$ Mpc. For a few of the models we create
multiple catalogues using several different biasing schemes, 
so that the sensitivity of methods to the detailed properties 
of biased galaxy formation can be investigated. The COBE 
normalized models arguably have a stronger theoretical mo- 
tivation, since they represent the predictions of models that 
assume inflationary primordial fluctuations and cold dark 
matter with the specified values of $\Omega_0$, $\Lambda_0$, $\Omega_b$, and the Hubble
constant. Since the structure normalized CDM models
all have the same spectral shape they are particularly useful
for testing techniques designed to measure $\Omega_0$ or $\Lambda_0$ independently
of an assumed shape of the primordial mass power
spectrum. We have presented analyses of aspects of these
simulations elsewhere (Eke, Cole & Frenk 1996; Cole et al.
1997, hereafter CWFR; Hatton & Cole 1998).

The paper is structured in the following way. The choice
and parameterization of the cosmological models are discussed
in Section 2. Section 3 is a full description of all the details
of our N-body simulations. The construction of the
initial conditions and their evolution are described in Sections
3.1 and 3.2. The biasing prescriptions are explained in
Section 3.3. The method by which the biased distributions
are converted into mock galaxy catalogues is presented in
Section 4.1. Our modelling of the survey geometries and selection
functions is detailed in Section 4.2. Section 5 presents
plots showing slices of the galaxy distributions in a selection
of the mock galaxy catalogues. The qualitative differences
that are discernible in these distributions and the processes
that give rise to them are discussed. In Section 6 we discuss
the limitations of our approach. Section 7 gives instructions
on how to obtain and manipulate the mock galaxy catalogues.
We conclude in Section 8.



2 COSMOLOGICAL MODELS 



All our cosmological models are variants of the cold dark 
matter (CDM) scenario. The functional form we adopt for 
the matter power spectrum is that given by Bardeen et al.
(1986),

$$P(k) \propto \frac{k^n\,[\ln(1 + 2.34q)]^2}{(2.34q)^2}\,\left[1 + 3.89q + (16.1q)^2 + (5.46q)^3 + (6.71q)^4\right]^{-1/2}, \qquad (2.1)$$



where $q = k/\Gamma$ and $k = 2\pi/\lambda$ is the wavenumber in units of
$h\,{\rm Mpc}^{-1}$. The index $n$ is the slope of the primordial power
spectrum, and in all but one case we adopt $n = 1$, as predicted
by the simplest models of inflation. Two further parameters
complete the description of the matter power spectrum.
These are the shape parameter $\Gamma$ and the amplitude of
the power spectrum, which we specify through the value of
$\sigma_8$, the linear theory rms fluctuation of the mass contained
in spheres of radius $8\,h^{-1}$ Mpc. The background cosmological
model in which these fluctuations evolve is specified by
the density parameter $\Omega_0$ and the cosmological constant $\Lambda_0$,
which we express in units of $3H_0^2/c^2$, where $H_0$ is the present
value of the Hubble parameter. Thus, with the exception of
the one tilted model with $n \neq 1$, our models are fully specified
by the values of four parameters, $\Omega_0$, $\Lambda_0$, $\sigma_8$ and $\Gamma$. For
each of our twenty models, Table 1 lists the values of these
parameters along with other parameters that are described
below. The names we have listed for the cosmological models
are consistent with the convention used in CWFR, but
in addition we have included (in parentheses) some more
descriptive names for the various $\Omega_0 = 1$ models.
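
As an illustration of how the shape parameter and the normalization together fix the power spectrum, the following Python sketch evaluates the functional form of equation (2.1) and rescales it to a chosen $\sigma_8$. It is only a sketch under stated assumptions: the function names (`bbks_power`, `sigma_r`, `amplitude_for_sigma8`), the spherical top-hat window, and the integration range and grid are choices made here for illustration and are not taken from the simulation code.

```python
import numpy as np

def bbks_power(k, gamma, n=1.0):
    """Unnormalized P(k) with the functional form of equation (2.1); k in h/Mpc."""
    k = np.asarray(k, dtype=float)
    q = k / gamma
    poly = 1 + 3.89 * q + (16.1 * q) ** 2 + (5.46 * q) ** 3 + (6.71 * q) ** 4
    return k ** n * (np.log(1 + 2.34 * q) / (2.34 * q)) ** 2 / np.sqrt(poly)

def sigma_r(gamma, r=8.0, n=1.0, amplitude=1.0):
    """Linear rms mass fluctuation in spheres of radius r (h^-1 Mpc).

    Assumption: a spherical top-hat window and a fixed log-spaced k grid.
    """
    lnk = np.linspace(np.log(1e-4), np.log(1e2), 4000)
    k = np.exp(lnk)
    x = k * r
    window = 3.0 * (np.sin(x) - x * np.cos(x)) / x ** 3
    # sigma^2 = integral over ln k of k^3 P(k) W(kr)^2 / (2 pi^2)
    f = amplitude * bbks_power(k, gamma, n) * window ** 2 * k ** 3 / (2 * np.pi ** 2)
    variance = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(lnk))  # trapezoid rule
    return np.sqrt(variance)

def amplitude_for_sigma8(gamma, sigma8, n=1.0):
    """Amplitude that rescales the unnormalized spectrum to the requested sigma_8."""
    return (sigma8 / sigma_r(gamma, 8.0, n)) ** 2

# Example: a Gamma = 0.25, sigma_8 = 0.55 spectrum
A = amplitude_for_sigma8(0.25, 0.55)
print(sigma_r(0.25, 8.0, amplitude=A))
```

Because $\sigma_r$ scales as the square root of the amplitude, a single evaluation of the unnormalized integral is enough to set the normalization for any requested $\sigma_8$.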

The COBE-normalized set of models consists of an
Einstein-de Sitter, $\Omega_0 = 1$, model (labelled E1 or CCDM for
COBE normalized CDM), three open models with $\Omega_0 = 0.3$,
0.4 and 0.5 (labelled O3-O5) and five flat models with
$\Omega_0 = 0.1$-$0.5$ and $\Omega_0 + \Lambda_0 = 1$ (labelled L1-L5). We do not
include COBE normalized open models with $\Omega_0 = 0.1$ or 0.2
because they are hopelessly inconsistent with the observed
abundance of rich galaxy clusters (CWFR). For each of the
open models we choose the value of the Hubble parameter $h$*
that gives a universe of age $t \approx 12$ Gyr, i.e. the largest value
of $h$ that is consistent with standard globular cluster age



* We use the convention that $h$ is the value of the Hubble parameter
in units of 100 km s$^{-1}$ Mpc$^{-1}$.

Table 1. Simulation parameters. The first column gives the label of each of the cosmological models; alternative, more descriptive names
for the $\Omega_0 = 1$ models are given in parentheses. The following seven columns give the corresponding values of the density parameter
$\Omega_0$, cosmological constant $\Lambda_0$, Hubble parameter $h$, age of the universe $t$, baryon content $\Omega_b$, power spectrum shape parameter $\Gamma$, and
normalization $\sigma_8$, respectively. The final two columns give the initial expansion factor, $a_i$, and the number of timesteps, $N_{\rm steps}$, used in the
N-body simulation.



Model            $\Omega_0$  $\Lambda_0$  $h$    $t$/Gyr  $\Omega_b$  $\Gamma$  $\sigma_8$  $a_i$   $N_{\rm steps}$
O3               0.3         0.0          0.65   12.2     0.030       0.172     0.5         0.15    93
O4               0.4         0.0          0.65   11.7     0.030       0.234     0.75        0.1     168
O5               0.5         0.0          0.6    12.3     0.035       0.27      0.9         0.08    254
L1               0.1         0.9          0.9    13.9     0.015       0.076     0.7         0.15    150
L2               0.2         0.8          0.75   14.0     0.022       0.131     0.9         0.12    220
L3               0.3         0.7          0.65   14.5     0.030       0.172     1.05        0.101   266
L4               0.4         0.6          0.6    14.5     0.035       0.213     1.1         0.09    275
L5               0.5         0.5          0.6    13.5     0.035       0.27      1.3         0.07    331
O2S              0.2         0.0          ...    ...      ...         0.25      1.44        0.028   447
O3S              0.3         0.0          ...    ...      ...         0.25      1.13        0.050   313
O4S              0.4         0.0          ...    ...      ...         0.25      0.95        0.073   258
O5S              0.5         0.0          ...    ...      ...         0.25      0.83        0.096   212
L2S              0.2         0.8          ...    ...      ...         0.25      1.44        0.057   405
L3S              0.3         0.7          ...    ...      ...         0.25      1.13        0.080   287
L4S              0.4         0.6          ...    ...      ...         0.25      0.95        0.102   224
L5S              0.5         0.5          ...    ...      ...         0.25      0.83        0.122   184
E1 (CCDM)        1.0         0.0          0.5    13.1     ...         0.5       1.35        0.061   327
E2 (tilted)      1.0         0.0          0.5    13.1     0.05        ...       0.55        0.20    200
E3S ($\tau$CDM)  1.0         0.0          ...    ...      ...         0.25      0.55        0.21    103
E4 (SCDM)        1.0         0.0          0.5    13.1     ...         0.5       0.55        0.15    170


estimates (Chaboyer et al. 1996; Renzini et al. 1996; Salaris,
Degl'Innocenti & Weiss 1997). For each of the low-$\Omega_0$ flat
models we choose the value of $h$ that gives $t \approx 14$ Gyr. For
$\Omega_0 = 1$ we take $h = 0.5$. Having chosen these values of $h$, we
fix the baryon fraction in these models using the constraint
from primordial nucleosynthesis of $\Omega_b = 0.0125\,h^{-2}$ (Walker
et al. 1991). We then use the following expression for the
shape parameter $\Gamma$,

$$\Gamma = \Omega_0 h \exp(-\Omega_b - \Omega_b/\Omega_0), \qquad (2.2)$$

which approximately accounts for the effect of baryons on
the transfer function (Sugiyama 1995)†. Finally, in each of
these models the amplitude of the density perturbations is
set so as to be consistent with the COBE measurements of
fluctuations in the cosmic microwave background (Smoot et
al. 1992). Further details of these models can be found in
CWFR, which examines the abundance of galaxy clusters
in COBE normalized CDM and presents some analysis of
clustering of the mass distributions.
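
The chain from $h$ to $\Omega_b$ to $\Gamma$ described above is short enough to check by hand. The Python sketch below reproduces it for a few of the COBE normalized models; the function names are hypothetical and introduced only for this illustration, and the reference values are the rounded entries of Table 1.

```python
import math

def omega_baryon(h):
    """Baryon density from the nucleosynthesis constraint Omega_b h^2 = 0.0125."""
    return 0.0125 / h ** 2

def shape_parameter(omega0, h):
    """Shape parameter Gamma of equation (2.2)."""
    omega_b = omega_baryon(h)
    return omega0 * h * math.exp(-omega_b - omega_b / omega0)

# (Omega_0, h) pairs for some COBE normalized models, taken from Table 1
for label, omega0, h in [("O3", 0.3, 0.65), ("O4", 0.4, 0.65), ("L1", 0.1, 0.9), ("L4", 0.4, 0.6)]:
    print(label, round(omega_baryon(h), 3), round(shape_parameter(omega0, h), 3))
```

The printed values can be compared with the $\Omega_b$ and $\Gamma$ columns of Table 1; small differences in the last digit reflect rounding of the tabulated $h$.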

For the set of structure normalized models, we adopt a
fixed value of $\Gamma = 0.25$, as suggested by observations of the
large-scale structure traced by galaxies (e.g. Maddox, Efstathiou
& Sutherland 1990a). We set the amplitude of the power
spectrum according to the formula $\sigma_8 = 0.55\,\Omega_0^{-0.6}$,
which results in an abundance of rich galaxy clusters in good
agreement with observations (White et al. 1993a). These
models include the Einstein-de Sitter model E3S (of which
we have two realizations labelled E3S A and E3S B), a series
of open models with $\Omega_0 = 0.2$-$0.5$ (labelled O2S-O5S), and
a series of flat models with $\Omega_0 = 0.2$-$0.5$ and $\Omega_0 + \Lambda_0 = 1$
(labelled L2S-L5S). Physically, these models could be produced
either by having $h = \Gamma/\Omega_0$ or by a change from the
standard model of the present energy density in relativistic
particles. For example the E3S model is very similar to the
$\tau$CDM model of Jenkins et al. (1997), which is motivated by
the decaying particle model proposed by Bond & Efstathiou
(1991). The final model listed in Table 1, E4 (SCDM), is the
"standard" CDM model ($\Gamma = h = 0.5$), normalized by the
abundance of galaxy clusters.
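
A short numerical check of the structure normalization, written as a hedged Python sketch: it evaluates $\sigma_8 = 0.55\,\Omega_0^{-0.6}$ and the Hubble parameter $h = \Gamma/\Omega_0$ that would be needed to obtain $\Gamma = 0.25$ without baryons or extra relativistic species. The function names are hypothetical, chosen here for illustration.

```python
def cluster_sigma8(omega0):
    """Cluster-abundance normalization sigma_8 = 0.55 * Omega_0^-0.6."""
    return 0.55 * omega0 ** -0.6

def implied_hubble(omega0, gamma=0.25):
    """Hubble parameter h = Gamma / Omega_0 implied by a fixed shape parameter."""
    return gamma / omega0

for omega0 in (0.2, 0.3, 0.4, 0.5, 1.0):
    print(omega0, round(cluster_sigma8(omega0), 2), round(implied_hubble(omega0), 2))
```

For $\Omega_0 = 0.2$ the implied value is $h = 1.25$, which illustrates why the alternative interpretation in terms of a modified relativistic energy density is mentioned for some of these models.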

We consider one further model that falls into both the 
COBE and structure normalized categories. This is the tilted 
Einstein-de Sitter model, E2 (tilted). For this model, the 
above constraints have been applied in relating the baryon
fraction $\Omega_b$, the Hubble parameter $h$, and the shape parameter
$\Gamma$, but in addition the slope $n$ of the primordial power
spectrum has been adjusted to match COBE observations
at large scales while simultaneously achieving $\sigma_8 = 0.55$,
as required for consistency with the observed cluster abundances.
This procedure results in a tilted primordial spectrum
with $n = 0.803$ and a transfer function with $\Gamma = 0.4506$
as given by equation 2.2. In normalizing to the COBE observations,
we have included a gravitational wave contribution
as predicted by the power-law model of inflation. For our 
model gravitational waves contribute approximately 55% of 
the rms temperature fluctuations on the scales probed by 
COBE.



† The expression for $\Gamma$ which we have adopted is from the original
version of the Sugiyama (1995) paper and differs slightly from
the expression in the published version of that paper, which was
modified to improve its accuracy for high values of $\Omega_b$.



3 N-BODY SIMULATIONS 

We now describe how the initial conditions of our simulations
were set up, how the simulated mass distribution was
propagated to the present day, and how the particles labelled
as galaxies were selected.



3.1 Initial conditions 

Before imposing the desired density perturbations, we set
up a 'uniform' distribution using the technique described by
White (1994) and Baugh, Gaztanaga & Efstathiou (1995) to
generate a particle distribution with a 'glass' configuration.
This was achieved by first randomly placing $192^3$ particles
throughout the simulation box and then evolving this distribution
with the N-body code, but with the sign of the
gravitational forces reversed. We used large timesteps which
were approximately logarithmically spaced in expansion factor
and evolved the distribution until the gravitational forces
on all particles practically vanished. With this approach, the
initial particle distribution is not regular, but the small random
fluctuations in the particle density field do not seed the
growth of spurious structures. Simulations with glass and
grid initial conditions have been found to give very similar
statistical results once they are evolved into the nonlinear
regime (White 1994; Baugh et al. 1995), but the simulations
with glass initial conditions have the advantage that they do
not retain an unseemly grid signature in uncollapsed regions.

Each of the simulations was of a periodic box of side
$345.6\,h^{-1}$ Mpc ($192 \times 1.8\,h^{-1}$ Mpc). For each, we created a
Gaussian random density field on a $192^3$ grid, using the same
Fourier phases from one model to the next, but varying the
mode amplitudes according to the model power spectrum.
We applied the Zel'dovich approximation to this density field
to compute displacements and peculiar velocities at each
grid point. We then displaced each particle from its 'glass'
position according to the displacements interpolated from
the grid values. The initial expansion factors of the simulations,
$a_i$, listed in Table 1, were determined by setting the
amplitude of the initial power spectrum at the Nyquist frequency
of the particle grid to be $0.3^2$ times that for an equivalent
Poisson distribution of particles. Thus $P_{\rm initial}(k_N) =
0.3^2/\bar{n}$, where $\bar{n}$ is the mean particle density and the Nyquist
frequency is $k_N = (2\pi/3.6)\,h\,{\rm Mpc}^{-1}$. The residual
power in the glass configuration is only 0.5% of that in a
Poisson distribution at the Nyquist frequency and drops very
rapidly at longer wavelengths (see figure A2 of Baugh et al.
1995). This choice is safely in the regime where (a) the initial
density fluctuations are large compared to those present
in the glass, and (b) the Zel'dovich approximation remains
accurate. In particular, no shell-crossing has occurred.



3.2 Evolution 

We evolved the simulations using a modified version of
Hugh Couchman's Adaptive Particle-Particle-Particle-Mesh
(AP$^3$M; Couchman 1991) N-body code. We set the softening
parameter of AP$^3$M's triangular-shaped cloud force law
to $\eta = 270\,h^{-1}$ kpc, 15% of the grid spacing. The softening
scale is fixed in comoving co-ordinates. This choice corresponds
approximately to a gravitational softening length
$\epsilon = \eta/3 = 90\,h^{-1}$ kpc for a Plummer force law, and we adopt
$\epsilon$ as our nominal force resolution. The size of the timestep $\Delta a$
was chosen so that the following two constraints were satisfied
throughout the evolution of the particle distribution.
First, the rms displacement of particles in one timestep was
less than $\eta/4$. Second, the fastest moving particle moved less
than $\eta$ in one timestep. Initially these two constraints are
comparable, but at late times the latter constraint is more
stringent, particularly in low $\Omega_0$ simulations. We monitored
energy conservation using the Layzer-Irvine equation (equation
12b of Efstathiou et al. 1985) and found that for this
choice of timestep, energy conservation with a fractional accuracy
of better than 0.3% was achieved. We also tested the
inaccuracy incurred by these choices of starting amplitude
and timestep by comparing the final particle positions with
two additional versions of the E1, $\Omega_0 = 1$ simulation, which
were run starting from a fluctuation amplitude a factor of
two lower and using timesteps a factor of two smaller. In
each case we found that the final particle positions agreed
very accurately, with rms differences of less than $\epsilon$. More
importantly, the correlation functions of the particle distributions
in all cases were indistinguishable at scales larger
than $\epsilon = 90\,h^{-1}$ kpc. Thus, the statistical clustering properties
of these simulations have a resolution that is limited by
the particle mass and force softening and not by the choice
of timestep or starting redshift.
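
The two timestep constraints can be expressed compactly. The following Python sketch is an illustration only; the function name `timestep_acceptable` and the array layout (one row of comoving displacement per particle) are assumptions made here, not a description of the AP$^3$M code.

```python
import numpy as np

GRID_SPACING = 1.8            # h^-1 Mpc
ETA = 0.15 * GRID_SPACING     # softening parameter, 0.27 h^-1 Mpc
EPSILON = ETA / 3.0           # nominal Plummer-equivalent resolution, 0.09 h^-1 Mpc

def timestep_acceptable(displacements, eta=ETA):
    """Check both constraints for the per-particle displacements of one step.

    displacements: array of shape (N, 3), comoving h^-1 Mpc.
    Returns True if the rms step is below eta/4 and the largest step is below eta.
    """
    step = np.linalg.norm(np.asarray(displacements, dtype=float), axis=1)
    rms = np.sqrt(np.mean(step ** 2))
    return bool(rms < eta / 4 and step.max() < eta)

small_steps = np.full((10, 3), 0.01)
print(timestep_acceptable(small_steps))
```

A timestep that fails either test would be shortened before being applied.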

3.3 Biasing 

In this section we describe the methods we use to select
the particles we label as galaxies from the distributions of
mass produced in the N-body simulations. It is unlikely that
galaxies are unbiased tracers of the underlying mass distribution.
This would only occur if the ability to form a galaxy
were independent of the properties of the surrounding density
field, so that each mass particle no matter where it
resided was equally likely to be associated with a galaxy.
Simple, physically motivated models such as the high peaks
model (Davis et al. 1985; Bardeen et al. 1986) illustrate
how a dependence of galaxy formation on the properties
of the local density field can make the galaxy distribution
more strongly clustered than the underlying mass distribution.
This effect can be quantified in terms of a bias factor
$b_r = \sigma_r^{\rm gal}/\sigma_r^{\rm mass}$, relating the fractional rms fluctuation in
the number of galaxies in spheres of radius $r\,h^{-1}$ Mpc to the
corresponding variation in the mass.

Observational evidence for bias is presented by Peacock
& Dodds (1994). They assume a simple, constant linear bias
model in which a perturbation in the mass distribution is
accompanied by an amplified perturbation in the galaxy distribution,
$\delta_{\rm gal} = b\,\delta_{\rm mass}$. They find that the power spectra
of differently selected galaxy samples require a bias relative
to the power spectrum of IRAS galaxies, $b/b_{\rm IRAS}$, of
4.5:1.9:1.3, for Abell clusters, radio galaxies and optically
selected galaxies respectively. Since a relative bias exists between
any two of these differently selected samples, it seems
natural to assume that all galaxy samples will be subject
to some degree of bias. We note that bias is also important
in interpreting the estimates of the mass-to-light ratio
of galaxies in clusters. These have been used in conjunction
with estimates of the galaxy luminosity function to infer
$\Omega_0 \approx 0.2$ (e.g. Carlberg, Yee & Ellingson 1997). This inference
assumes that galaxies are unbiased tracers of the mass
distribution. If galaxies form preferentially in proto-cluster
environments, then this estimate translates to $\Omega_0/B \approx 0.2$,
where $B$ is the factor by which the efficiency of galaxy formation
is enhanced in regions destined to become clusters,
relative to the field.

Since the physics of galaxy formation is very complex, 
it is not yet possible to determine the function that relates 
the probability of forming a galaxy to the properties of the 
mass density field, though first steps towards this goal have 
been taken using cosmological simulations with gas dynamics
(Cen & Ostriker 1992; Katz, Hernquist & Weinberg
1993; Summers, Davis & Evrard 1995; Frenk et al. 1996;
Jenkins et al. 1997). For this reason we take the approach of
defining our biasing algorithm in terms of a simple paramet- 
ric function. Then for each cosmological model we constrain 
the values of the function's parameters using estimates of ob- 
served small and intermediate scale galaxy clustering. For a 
subset of the cosmological models we repeat this procedure 
for a variety of different biasing algorithms. This enables the 
extent to which the properties of the catalogues depend on 
the arbitrary choice of the adopted biasing algorithm to be 
quantified. 

For optically selected galaxies in the APM survey
$\sigma_8^{\rm gal} = 0.96$ (Maddox et al. 1996). Many of our simulations
have $\sigma_8^{\rm mass} > 0.96$ and therefore require an anti-bias ($b < 1$)
on the $8\,h^{-1}$ Mpc scale. Anti-bias seems less physically motivated
than bias because it requires negative feedback processes
to suppress galaxy formation in high-density regions.
Such an anti-correlation, however, might be produced even 
if the production rate of galaxies in proto-clusters is higher 
than in low-density regions, so long as galaxy merging in the 
proto-clusters is sufficiently efficient to suppress the overall 
number of galaxies in clusters. 
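
A rough indication of which models need bias and which need anti-bias follows from the ratio $b = \sigma_8^{\rm gal}/\sigma_8^{\rm mass}$. The Python sketch below uses the linear-theory $\sigma_8$ values of Table 1 as a stand-in for the mass variance; this is an assumption made for illustration, since the variance measured in the evolved simulations is nonlinear and will differ somewhat, so models with $\sigma_8$ close to 0.96 should not be classified on this basis. The function name is hypothetical.

```python
SIGMA8_GAL = 0.96   # APM value quoted in the text

# Linear-theory sigma_8 for a few models, taken from Table 1
SIGMA8_MASS = {"O3": 0.5, "L3": 1.05, "O2S": 1.44, "L3S": 1.13, "E1": 1.35, "E3S": 0.55}

def required_bias(sigma8_mass, sigma8_gal=SIGMA8_GAL):
    """Bias factor on the 8 h^-1 Mpc scale, b = sigma8_gal / sigma8_mass."""
    return sigma8_gal / sigma8_mass

for name, s8 in SIGMA8_MASS.items():
    b = required_bias(s8)
    kind = "bias" if b > 1 else "anti-bias"
    print(f"{name:4s} sigma8 = {s8:4.2f}  b = {b:4.2f}  ({kind})")
```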

All the biasing schemes we consider are local, in the 
sense that the probability of a mass particle being selected 
as a galaxy is a function only of the neighbouring density 
field, e.g. the density field smoothed on a scale of $3\,h^{-1}$ Mpc.
Such models have the property that on scales (in the linear
regime) that are much larger than that defining the neighbourhood
they produce a constant, scale independent bias
(Scherrer & Weinberg 1995). A derivation of an expression
for this asymptotic bias is given in Section 3.3.2 below. Our
algorithms include both Lagrangian models, in which the
selection probability is a function of the initial density field,
and Eulerian models, in which the probability is a function
of the final mass density field. For a consideration of the
differences between these approaches, see Mann, Peacock &
Heavens (1998).



We use six different prescriptions for creating the biased
galaxy samples. All of them involve defining a probability
field from either the initial or the final density distribution,
and then Poisson sampling the simulation particles using
this field to define the selection probability. The probability
is normalized such that a mean of $128^3$ out of our original
$192^3$ particles are selected. This corresponds to a galaxy
number density $n_g \approx 0.05\,h^3\,{\rm Mpc}^{-3}$, which approximately
equals that of galaxies brighter than $L_*/80$. Although this
density is less than that of the original simulation, occasionally
the bias may demand that in certain regions there is
a greater galaxy density than the original particle density.
The Poisson sampling achieves this by allowing some particles
to be selected more than once. This double sampling
is generally rare but can occur in the highly biased models.
The functions defining the selection probability have one or
two free parameters. In the case of those with just one free
parameter, we fix its value by demanding that $\sigma_8^{\rm gal} = 0.96$,
in agreement with the value estimated from the APM galaxy
survey. The models with two parameters ($\alpha$ and $\beta$) enable us
to control both the amplitude of galaxy clustering on large
scales and, to some extent, the slope of the galaxy correlation
function on small scales. We set their parameters by attempting
to match simultaneously the observed variance of
the galaxy density field in cubic cells of 5 and $20\,h^{-1}$ Mpc on
a side. These we take to be $\sigma_{\rm cell5} = 2.0$ and $\sigma_{\rm cell20} = 0.67$,
the values we have obtained from the power spectrum shape
estimated for APM galaxies by Baugh & Efstathiou (1994),
scaling its amplitude for consistency with the more recent estimate
of $\sigma_8^{\rm gal} = 0.96$. In some cases, where, for instance, the
small scale mass correlation function is very much steeper
than the observed galaxy correlation, it does not prove possible
to simultaneously satisfy these two constraints. For computational
simplicity and to avoid any ambiguity, we choose,
in all cases, to fit the observed values by minimizing the cost
function



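$$C(\alpha,\beta) = \left(\frac{\sigma_{\rm cell20} - 0.67}{0.67}\right)^2 + \left(\frac{\sigma_{\rm cell5} - 2.0}{2.0}\right)^2 + \epsilon_c\,(\ldots).$$

The third term has $\epsilon_c = 4 \times 10^{-7}$ and is included to avoid
extremely large values of $|\alpha|$ and $|\beta|$ being selected for very
little improvement in the values of $\sigma_{\rm cell5}$ and $\sigma_{\rm cell20}$.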
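
The Poisson sampling step described earlier in this section, in which a probability field is normalized so that a mean of $128^3$ of the $192^3$ particles are selected and a particle may be selected more than once, can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the function names are hypothetical, and the per-particle weights stand for whatever unnormalized selection probability a given biasing model assigns.

```python
import numpy as np

BOX = 345.6            # box side, h^-1 Mpc
N_GALAXIES_1D = 128    # cube root of the target mean number of galaxies

def mean_galaxy_density():
    """Mean galaxy number density in h^3 Mpc^-3 implied by 128^3 galaxies in the box."""
    return N_GALAXIES_1D ** 3 / BOX ** 3

def poisson_sample(weights, n_target, rng):
    """Draw the number of galaxies attached to each particle.

    weights: unnormalized selection probability per particle.
    The weights are rescaled so that the expected total is n_target; each
    particle then receives a Poisson-distributed count, so counts > 1 occur.
    """
    weights = np.asarray(weights, dtype=float)
    expected = weights * (n_target / weights.sum())
    return rng.poisson(expected)

rng = np.random.default_rng(42)
weights = np.ones(100_000)                          # unbiased case: equal weights
counts = poisson_sample(weights, 100_000 * (128 / 192) ** 3, rng)
print(mean_galaxy_density(), counts.sum())
```

Particles with a count of two or more correspond to the "double sampling" mentioned in the text; it becomes more frequent when the weights are strongly peaked.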



3.3.1 Biasing algorithms 

Here we define the selection probability functions, $P(\nu)$,
which define each of our biasing algorithms. The resulting
biased galaxy correlation functions, $\xi(r)$, and power spectra,
$P(k)$, are shown in Figs. 1, 2 and 3 and discussed below.
The biasing algorithm that we apply to all of the cosmological
models is model 1; the other biasing models are used
only to create additional mock catalogues for the O4S, L3S,
and E3S simulations.

Model 1: This model bases the selection probability on
the value of the smoothed initial density. The initial density
field is smoothed with a Gaussian of width $R_S = 3\,h^{-1}$ Mpc
(in $\exp(-r^2/2R_S^2)$) to define a smoothed density field $\rho_S(\mathbf{r})$
at the initial particle position. A dimensionless variable $\nu$ is
defined as $\nu(\mathbf{r}) = \delta_S(\mathbf{r})/\sigma_S$, where the density perturbation
$\delta_S(\mathbf{r}) = (\rho_S(\mathbf{r}) - \bar{\rho})/\bar{\rho}$, and $\sigma_S^2 = \langle|\delta_S|^2\rangle$. We then adopt

$$P(\nu) \propto \begin{cases} \exp(\alpha\nu + \beta\nu^{3/2}) & \text{if } \nu \ge 0 \\ \exp(\alpha\nu) & \text{if } \nu < 0 \end{cases} \qquad (3.1)$$

as the selection probability. The model has two free parameters
$\alpha$ and $\beta$. This choice of functional form is essentially
selected for its simplicity. Its exponential form ensures that
the probability cannot be negative. The dependence on $\beta$
for $\nu > 0$ enables the selection probability to be enhanced
($\beta > 0$) or suppressed ($\beta < 0$) in the densest regions. It
is this property which gives some control over the slope of
the small scale correlation function. The choice of $\nu^{3/2}$ is
such that the probability converges when integrated over a
Gaussian distribution of $\nu$.
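
The selection probability of equation (3.1) is simple to evaluate. The Python sketch below is an illustration only (the function name `selection_probability` is hypothetical); it returns the unnormalized probability, with the overall normalization left to the sampling step. The example parameter values are those listed for model 1 and the E3S simulation in Table 2.

```python
import numpy as np

def selection_probability(nu, alpha, beta):
    """Unnormalized P(nu) of equation (3.1).

    For nu >= 0 the exponent is alpha*nu + beta*nu^1.5; for nu < 0 it is alpha*nu.
    Clipping nu at zero makes the beta term vanish on the negative branch.
    """
    nu = np.asarray(nu, dtype=float)
    positive_part = np.clip(nu, 0.0, None)
    return np.exp(alpha * nu + beta * positive_part ** 1.5)

nu = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 4.0])
print(selection_probability(nu, alpha=1.10, beta=-0.56))
```

With $\beta < 0$ the probability rises with $\nu$ at moderate overdensity and is suppressed again in the densest regions, which is how the small-scale slope of the correlation function is adjusted.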

Model 2: For this model the same functional form
(equation 3.1) is used to define the selection probability, but
this time the variable $\nu$ is defined in terms of the smoothed

Table 2. Bias model parameters. For the three selected cosmological models we list the parameter values required in each of the six
bias models. The resulting galaxy correlation functions are compared in Fig. 3.



Identifier   Model 1             Model 2             Model 3   Model 4    Model 5    Model 6
             $\alpha$  $\beta$   $\alpha$  $\beta$   $\nu_p$   $\rho_t$   $\alpha$   $\alpha$  $\beta$
O4S          3.60   -9.05        2.17   -1.31        ...       <19.7      -0.02      3.96   -2.69
L3S          2.55   -17.75       0.15   -0.06        ...       <15.5      -0.13      7.11   -7.15
E3S          1.10   -0.56        1.26   -0.51        1.005     >0.98      0.56       2.98   -1.25




final density field around each particle. Again, a Gaussian
smoothing with $R_S = 3\,h^{-1}$ Mpc is adopted.

Model 3: This is the standard high peaks model of
Bardeen et al. (1986). Their results are used to predict the
number of peaks of amplitude $\nu > \nu_p$ defined on the scale of
a galaxy as a function of the density smoothed on a larger
scale that is resolvable in our simulation. In this case we
choose the larger scale to be defined by applying a sharp
cutoff to the power at wavelengths $\lambda < 4\,h^{-1}$ Mpc, which is
quite well resolved in the initial conditions of the simulation.
We define the galaxy mass scale by a Gaussian smoothing
with $R_S = 0.54\,h^{-1}$ Mpc as adopted by White et al. (1987).
Here the model parameter is $\nu_p$. An unavoidable property
of assuming that galaxies form in peaks of the density field
is that they are more clustered than the mass distribution
($b > 1$). Thus this method cannot be applied in cases where
an anti-bias is required.

Model 4: In this model a sharp cut-off is applied to the
final smoothed density field, so that galaxies are entirely
prohibited from forming in very underdense regions, but have
an equal chance of forming wherever the overdensity rises
above a certain threshold, ρ_t. Thus

\[
P(\nu) = \begin{cases}
1 & \text{if } \rho(r) \geq \rho_t \\
0 & \text{if } \rho(r) < \rho_t .
\end{cases}
\tag{3.2}
\]

This is the case if a bias greater than unity is required. For
an anti-bias, the conditions are reversed and galaxies are
prohibited from forming in the very densest regions. Note
that this prescription for producing anti-bias seems quite
unphysical, as it implies that the highest mass density
regions have no galaxies at all.

Model 5: As in model 2 the selection probability is defined
in terms of the smoothed final density, but this time
the functional form adopted is a power law,

\[
P(\nu) \propto \nu^{\alpha} .
\tag{3.3}
\]

Here a positive value of the parameter, α, will induce a bias
(b > 1) and a negative value an anti-bias (b < 1). The bias
inferred by Cen & Ostriker (1993) from their hydrodynamic
cosmological simulations has roughly this form, with α ≈ 1.5.

Model 6: This algorithm is a variation of model 2 and
again uses the formula (3.1), but with a different definition
of the overdensity parameter ν. Instead of smoothing on
a fixed scale of 3 h^-1 Mpc, the distribution was adaptively
smoothed by setting the density at the position of each
particle, ρ ∝ 1/r_10^3, where r_10 is the distance to the 10th nearest
neighbour of that particle.

The various galaxy correlation functions and power
spectra resulting from applying biasing model 1 to each of
our cosmological simulations are shown in Fig. 1 and Fig. 2
respectively. The solid data points show the estimates of the
galaxy correlation function, ξ(r) (Baugh 1996), and power
spectrum, P(k) (Baugh & Efstathiou 1993), of APM galaxies,
scaled in amplitude to match the updated estimate of
σ_8^gal = 0.96 for the APM survey (Maddox et al. 1996). The
data points plotted as open symbols on the top left panel
of Fig. 1 show the APM correlation function as estimated
from the Fourier transform of the estimated APM power
spectrum. There is a slight difference between this and the
direct estimate at large separations, which arises because
both ξ(r) and P(k) are estimated using non-linear inversions
of the measured angular correlation function. The difference
is an indication of one of the systematic errors involved in
estimating ξ(r) on large scales.

In general the two parameter biasing model is successful
in matching both the amplitude and the shape of the
galaxy correlation function on scales of 1-10 h^-1 Mpc, as can
clearly be seen in Fig. 1. For a few cases, such as E1, O2S,
and L2S, which have high values of σ_8^mass and consequently
steep non-linear mass correlation functions, the bias model
cannot reduce the slope of the correlation function enough to
accurately match the observed value. The behaviour of the
correlation functions on large scales reflects each model's
value of the power spectrum shape parameter Γ. The APM
data, if fitted with a Γ-model, prefer Γ = 0.15-0.2 (e.g.
Efstathiou, Sutherland & Maddox 1990), so even our structure
normalized, Γ = 0.25 models fall short of the amount of
large-scale power evident in the APM correlation function.
This short-fall is also exaggerated by a statistical fluctuation
in our simulation initial conditions. As can be seen in
the top right-hand panel of Fig. 2, the first realization (A) of
model E3S has less power on scales 0.03 ≲ k ≲ 0.06 h Mpc^-1
than the second realization (B) of the same model. This
downward fluctuation in the power is present in all the other
cosmological models, since all the initial density fields were
generated from the same basic Gaussian random field but
with expected mean amplitudes rescaled according to the
model power spectrum. We also note that the longest
wavelength modes, with k = 0.018 h Mpc^-1, are noisy because of
the small number of such modes contained in the simulation
box. The comparison of model and APM galaxy power
spectra on small scales (high k) is in accord with the small
scale behaviour of the correlation functions.
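
The one-parameter prescriptions of models 4 and 5 (equations 3.2 and 3.3) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the `antibias` flag and the rescaling to a target galaxy number in `select_galaxies` are hypothetical and are not taken from the paper, and the power law assumes a positive argument.

```python
import random

def threshold_probability(density, rho_t, antibias=False):
    """Model 4 (eq. 3.2): sharp cut at rho_t; the condition is reversed for anti-bias."""
    if antibias:
        return 0.0 if density >= rho_t else 1.0
    return 1.0 if density >= rho_t else 0.0

def power_law_probability(nu, alpha):
    """Model 5 (eq. 3.3): unnormalised P(nu) proportional to nu**alpha (assumes nu > 0)."""
    return nu ** alpha

def select_galaxies(values, prob, n_target, rng):
    """Keep each particle with probability prob(value), rescaled (an assumption
    of this sketch) so that roughly n_target particles survive."""
    weights = [prob(v) for v in values]
    scale = n_target / sum(weights)
    return [i for i, w in enumerate(weights) if rng.random() < min(1.0, scale * w)]
```

For example, `select_galaxies(densities, lambda d: threshold_probability(d, 2.0), 1000, random.Random(0))` would return the indices of roughly 1000 particles, all drawn from regions at or above the threshold.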

Figure 1. The galaxy correlation functions, ξ(r), for each of our cosmological models when biased using bias model 1. Each of the lines
corresponds to a different cosmological model as indicated on the legend. The solid data points are the same on each panel and are an
estimate of the galaxy correlation function from the APM survey (Baugh 1996). The open data points, shown on the first panel, show
an alternative estimate of the APM correlation function obtained by Fourier transforming the Baugh & Efstathiou (1993) estimate of
the APM power spectrum.

The manner in which the galaxy clustering statistics
vary with the form of the biasing is illustrated in Fig. 3.
The 1-parameter bias models (models 3, 4 and 5) do not
have the flexibility to control both the amplitude and slope
of the galaxy correlation function. Thus, in general, these
models do not match the APM galaxy correlation function
over a wide range of scales. In particular, the galaxy
correlation functions of the three models selected for Fig. 3
are steeper than the correlation function of APM galaxies,
reflecting the steepness of the underlying mass correlation
functions. The 3 h^-1 Mpc filter used in bias model 2 smooths
over the structure of groups and clusters in the final density
field. As a result, the small-scale slope of the galaxy
correlation function ends up being insensitive to the bias model
parameters in this case. In model 6, on the other hand, the
use of an adaptive smoothing results in better resolution
on the scale of groups and clusters. In some cases this is
enough to enable the required adjustments to the slope of
the correlation function on small scales.
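
The adaptive density estimate used by model 6, ρ ∝ 1/r_10^3, can be illustrated with a brute-force sketch. The function name is hypothetical, the constant of proportionality is set to one, and periodic boundaries are ignored; a production code would use a tree or grid search rather than this O(N^2) loop.

```python
import math

def adaptive_density(points, k=10):
    """Return a density proportional to 1 / r_k**3 for every particle, where r_k is
    the distance to its k-th nearest neighbour (non-periodic, brute force)."""
    densities = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        densities.append(1.0 / dists[k - 1] ** 3)
    return densities
```

Because r_10 shrinks inside groups and clusters, the effective smoothing scale follows the local structure, which is the property the text credits for the better small-scale behaviour of model 6.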
Figure 2. The galaxy power spectrum, P(k), for each of our cosmological models when biased using bias model 1. Each of the lines
corresponds to a different cosmological model as indicated on the legend. The data points are the same on each panel and are an estimate
of the galaxy power spectrum from the APM survey (Baugh & Efstathiou 1993).



3.3.2 The asymptotic bias 

In general all the biasing algorithms discussed above give
rise to a bias that is scale dependent. However, since these
biasing algorithms only depend on local properties of the
density field, the bias should tend to a constant on large
scales. Where the selection probability is a function of the
initial density field, the value of this asymptotic bias can
be computed analytically. The probability that a mass particle
is selected as a galaxy is taken to be P(ν), where ν
is the amplitude of the initial density fluctuation in units
of the rms, σ_s. The normalization of P(ν) is determined by
the integral over the Gaussian distribution of initial density
fluctuations,

\[
\frac{1}{\sqrt{2\pi}} \int P(\nu)\, e^{-\nu^2/2}\, d\nu = 1 .
\tag{3.4}
\]

Figure 3. For three selected, structure normalized cosmological
models (E3S, O4S and L3S), we show the galaxy correlation
functions that result from each of the bias models. Note that both of
the Ω_0 < 1 models require anti-bias and therefore cannot be
biased using the peaks bias model 3. The line types corresponding
to each of the bias models are indicated on the legend. The data
points again show the estimate of the galaxy correlation function
from the APM survey.

The density of galaxies selected in a region in which a large
scale perturbation Δ is added will be given, relative to the
mean galaxy density, by

\[
\frac{\rho'_{\rm gal}}{\bar\rho_{\rm gal}} = \frac{(1+\Delta)}{\sqrt{2\pi}} \int P(\nu')\, e^{-\nu^2/2}\, d\nu ,
\tag{3.5}
\]

where ν' = ν + Δ/σ_s. A first order series expansion of P(ν')
yields

\[
P(\nu') = P(\nu) \left( 1 + \frac{d\ln P}{d\nu} \frac{\Delta}{\sigma_s} \right) .
\tag{3.6}
\]

Hence

\[
\frac{\rho'_{\rm gal}}{\bar\rho_{\rm gal}} = \frac{(1+\Delta)}{\sqrt{2\pi}} \int P(\nu) \left( 1 + \frac{d\ln P}{d\nu} \frac{\Delta}{\sigma_s} \right) e^{-\nu^2/2}\, d\nu ,
\tag{3.7}
\]

which simplifies to

\[
\frac{\rho'_{\rm gal}}{\bar\rho_{\rm gal}} = (1+\Delta) \left( 1 + \frac{\Delta}{\sqrt{2\pi}\,\sigma_s} \int \frac{dP}{d\nu}\, e^{-\nu^2/2}\, d\nu \right) .
\tag{3.8}
\]

We can thus define an asymptotic bias factor, i.e. the ratio
of the galaxy to the mass perturbations on large scales, as

\[
b_{\rm asymp} = 1 + \frac{1}{\sqrt{2\pi}\,\sigma_s} \int \frac{dP}{d\nu}\, e^{-\nu^2/2}\, d\nu .
\tag{3.9}
\]

This result is compared to the bias estimated from the
simulations in Fig. 4. The figure clearly shows that the bias
does indeed tend towards its asymptotic value, as calculated
above, on large scales.
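
As a numerical check of equation (3.9), the sketch below evaluates the integral by the trapezoidal rule for an arbitrary unnormalised P(ν), applying the normalization of equation (3.4) internally. The function name and integration settings are hypothetical. The test case P(ν) ∝ exp(αν) is chosen only because it has the closed-form answer b = 1 + α/σ_s; it is not claimed to be the paper's equation (3.1).

```python
import math

def asymptotic_bias(prob, sigma_s, nu_max=10.0, n=20000, eps=1e-5):
    """Evaluate eq. (3.9) for an unnormalised selection probability prob(nu)."""
    h = 2.0 * nu_max / n
    norm = 0.0    # integral of P(nu) exp(-nu^2/2)
    slope = 0.0   # integral of dP/dnu exp(-nu^2/2)
    for i in range(n + 1):
        nu = -nu_max + i * h
        w = (0.5 if i in (0, n) else 1.0) * h * math.exp(-0.5 * nu * nu)
        norm += prob(nu) * w
        # central finite difference for dP/dnu
        slope += (prob(nu + eps) - prob(nu - eps)) / (2.0 * eps) * w
    # the sqrt(2 pi) factors of eqs (3.4) and (3.9) cancel in this ratio
    return 1.0 + slope / (norm * sigma_s)

alpha, sigma_s = 0.8, 2.0
b = asymptotic_bias(lambda nu: math.exp(alpha * nu), sigma_s)
```

With these illustrative values the closed-form result is 1 + 0.8/2 = 1.4, and a negative α gives b < 1, i.e. an anti-bias.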



4 MOCK CATALOGUES 

In the previous section we described the procedure by which 
we create a galaxy distribution within each simulation cube. 
We now describe how these are manipulated and sampled to 
create the mock galaxy catalogues. It should be noted that 
we do not attempt to mimic the imperfections that will
inevitably be present in the genuine catalogues, e.g., Galactic
extinction, excluded regions around bright stars, or missing 
members of galaxy pairs separated by less than the mini- 
mum fibre spacing. Our goal is instead to create idealized 
catalogues with the expected redshift distributions and ge- 
ometrical properties of the genuine surveys. We anticipate 
that members of the 2dF and SDSS collaborations will cre- 
ate a few mock catalogues that incorporate the finer details 
of the survey properties. 



4.1 Survey geometry 

The specifications of both the 2dF and Sloan surveys may 
be slightly modified after evaluating the results from the 
current period of test observations. The areas which we have 
adopted are shown in Fig. 5 and defined below.

The main SDSS area is an elliptical region centred at
R.A. = 12h 20m, δ = 32.8°, close to the North Galactic Pole
(NGP) and covering 3.11 steradians. The minor axis of the 
ellipse spans 110° and runs along a line of constant R.A. The 
major axis spans 130°. Our mock catalogues do not include 
the strips in the Southern Galactic Cap that will also be part 
of the SDSS redshift survey; larger simulation volumes are 
needed to model simultaneously the Northern and Southern 
SDSS. 

Figure 4. The scale-dependent bias, b(k) = [P_gal(k)/P_mass(k)]^{1/2}, for each of our cosmological models when biased using bias model 1. Each
of the lines corresponds to a different cosmological model as indicated on the Figure. To the right of each panel we show the value of the
expected asymptotic bias on large scales, as explained in Section 3.3.2.

The main 2dF survey consists of two broad declination
strips. The larger is approximately centred on the SGP and
covers the declination range -22.5° > δ > -37.5°. This
declination range breaks into three contiguous, 5° wide strips,
each with slightly different ranges in R.A., which from north
to south are 21h 48m < R.A. < 3h 24m, 21h 39.5m < R.A. <
3h 43.5m and 21h 49m < R.A. < 3h 29m. The smaller strip
in the northern galactic hemisphere covers -7.5° < δ < 2.5°
and 9h 50m < R.A. < 14h 50m. Together they cover a solid
angle of 0.51 steradians. There is considerable overlap
between the northern slice and the area covered by the SDSS.
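
The 2dF strip boundaries quoted above translate into a simple angular mask. The sketch below is a hypothetical helper, not the authors' code. It assumes the three southern strips divide the declination range at -27.5° and -32.5°, as implied by "three contiguous, 5° wide strips", and it handles the R.A. ranges that wrap through 0h.

```python
def hm(hours, minutes=0.0):
    """Convert a right ascension given in hours and minutes to degrees."""
    return 15.0 * (hours + minutes / 60.0)

# (dec_min, dec_max, ra_min, ra_max) in degrees, from the ranges quoted in the text
STRIPS_2DF = [
    (-27.5, -22.5, hm(21, 48), hm(3, 24)),
    (-32.5, -27.5, hm(21, 39.5), hm(3, 43.5)),
    (-37.5, -32.5, hm(21, 49), hm(3, 29)),
    (-7.5, 2.5, hm(9, 50), hm(14, 50)),
]

def in_ra_range(ra, ra_min, ra_max):
    """True if ra lies in [ra_min, ra_max], allowing the range to wrap through 0h."""
    ra %= 360.0
    if ra_min <= ra_max:
        return ra_min <= ra <= ra_max
    return ra >= ra_min or ra <= ra_max

def in_2df(ra, dec):
    """True if the position (degrees) falls inside any of the 2dF strips."""
    return any(d0 <= dec <= d1 and in_ra_range(ra, r0, r1)
               for d0, d1, r0, r1 in STRIPS_2DF)
```

A position such as R.A. = 0h, δ = -30° lies in the southern strip, while R.A. = 12h, δ = 0° lies in the northern one.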



4.2 The radial selection function 

The galaxies of the 2dF survey are selected from the APM 
galaxy survey and will be complete to an extinction cor- 
rected apparent magnitude of Bj < 19.45. The SDSS will 
have galaxies selected from its own multi-band digital pho- 
tometry. The primary selection will be made in the Gunn-r 
band, and it will include a surface brightness threshold to 
ensure that an adequate fraction of the galaxy light goes
down a 3" fibre (see Gunn & Weinberg 1991 for details).

Figure 5. An equal area (Mollweide) projection of the whole sky showing the regions covered by the 2dF and SDSS galaxy redshift
surveys. The regions covered by the 2dF survey are indicated by the areas populated by points. These are the galaxy positions for a
narrow range in redshift from one of our mock catalogues. The 2dF consists of two strips. The larger crosses the SGP while the small
one runs close to the NGP. The solid curve marks the boundary of the SDSS survey, which is an ellipse centred close to the NGP. We
do not include the SDSS southern strips. The grid indicates the RA and dec. coordinates.



For simplicity, and because our goal is merely to match the
geometry and depth of the two surveys, we make our
selection for both catalogues in the B_J band. For the SDSS
we adopt a magnitude limit of B_J < 18.9 so as to
approximately reproduce the SDSS target of 900,000 galaxies in
the survey area. A mock catalogue from a (600 h^-1 Mpc)^3
N-body simulation that mimics the SDSS selection function
in greater detail will be presented elsewhere (Gott et al.,
in preparation; see also Gunn & Weinberg 199?). In addition
to its primary galaxy sample, the SDSS will target a
set of ~100,000 luminous red elliptical galaxies, to create
a deep, sparse sample that is approximately volume-limited
to z ~ 0.4. Similarly, the 2dF programme includes a deep
extension to R ~ 21 which will contain ~10,000 galaxies.
We do not attempt to model these samples because their
median depths are larger than our simulation cubes.

In order to compute the radial selection functions of the
surveys, we adopt a Schechter function description of the B_J
band luminosity function,

\[
\frac{d\phi(L)}{dL}\, dL = \phi^* \left( \frac{L}{L^*} \right)^{\alpha^*} \exp(-L/L^*)\, \frac{dL}{L^*} ,
\tag{4.1}
\]

with absolute magnitude M_BJ = M_⊙ - 2.5 log10(L/L_⊙). We
relate the apparent magnitude B_J of a galaxy at redshift z
to the corresponding absolute magnitude M_BJ at redshift
z = 0 using

\[
B_J = e + k + 5\log_{10}(d_L / h^{-1}\,{\rm Mpc}) + 25 + (M_{B_J} - 5\log_{10} h) .
\tag{4.2}
\]

Here d_L is the luminosity distance to redshift z in the
appropriate cosmological model. The term "k" denotes the
so-called k-correction, which arises from the Doppler shift to
the wavelength of the galaxy's spectral energy distribution
when viewed in the observer's frame. The term "e" describes
the effect of luminosity evolution in the galaxy as a result of
a combination of passive evolution of the stellar populations
and star formation. This model therefore allows for
luminosity evolution, but not for any change in the shape of the
galaxy luminosity function, which might occur as a result of
galaxy merging or luminosity dependent evolution.
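
Combining equations (4.1) and (4.2) gives the mean density of galaxies brighter than a survey limit at each redshift. The worked form below is a sketch: the symbols B_lim, M_lim, L_min and Γ_inc are notation introduced here for illustration, with Γ_inc denoting the upper incomplete gamma function (not the power spectrum shape parameter Γ), and M* the absolute magnitude corresponding to L*.

```latex
% Faintest absolute magnitude visible at redshift z for a survey limit B_lim (from eq. 4.2)
M_{\rm lim}(z) - 5\log_{10} h = B_{\rm lim} - e - k - 5\log_{10}(d_L / h^{-1}\,{\rm Mpc}) - 25 ,
\qquad
\frac{L_{\rm min}(z)}{L^*} = 10^{-0.4\,[M_{\rm lim}(z) - M^*]} .

% Integrating the Schechter function (eq. 4.1) above L_min
n_s(z) = \int_{L_{\rm min}(z)}^{\infty} \frac{d\phi(L)}{dL}\, dL
       = \phi^*\, \Gamma_{\rm inc}\!\left( \alpha^* + 1,\; \frac{L_{\rm min}(z)}{L^*} \right) .
```

Since L_min grows with distance, n_s(z) falls monotonically with redshift, which is what sets the depth of each mock catalogue.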

Even over the relatively limited range of apparent mag- 
nitudes covered by the APM survey, the galaxy number 
counts are a significantly steeper function of apparent magnitude
than is predicted by non-evolving models (Maddox
et al. 1990b). In contrast, the K-band galaxy counts have
shown no evidence for such a steep slope (Gardner et al.
1997), but recently Phillips & Turner (1998) have used a



compilation of survey data to argue that at the brightest 
magnitudes the K-band slope is as steep as that seen in the 
B-band. Unless we live in a very large underdense region or 
there exists some as yet unidentified systematic error in the 
bright galaxy counts, some form of rapid galaxy evolution is 
necessary. The counts can be reproduced by a model with 
strong luminosity evolution such as can be accommodated in 
eqn. (4.2), but at somewhat fainter magnitudes than those 
covered by the SDSS and 2dF surveys such a model predicts 
a tail of high redshift galaxies that is not seen in deep
spectroscopic galaxy samples (e.g. Colless et al. 1990). Thus, a
more complicated form of evolution is required, either one 
in which different galaxies evolve at different rates or one 
in which galaxies merge so that the number of galaxies is 
not conserved. The new redshift surveys themselves will give 
important information on the evolution of the galaxy luminosity
function. However, for the purposes of quantifying large-scale
structure this is not a problem provided that the selection
function can be accurately determined. We have therefore 
adopted a simple model that produces a selection function 
with similar depth to that which we expect the surveys to 
have. 

In our standard model we adopt the parameters found
by Loveday et al. (1992) for the APM-Stromlo bright galaxy
survey, M*_BJ - 5 log10 h = -19.5, α* = -0.97 and φ* =
1.4 × 10^-2 h^3 Mpc^-3. We also set k + e = 0, i.e. we assume
that strong luminosity evolution occurs which cancels the
k-correction. While this cancellation seems coincidental, Fig. 6
shows that this simple choice gives reasonable agreement
with the observed galaxy number counts at B_J ≈ 19.5 and
so will produce mock galaxy catalogues with approximately
the number of galaxies expected in the 2dF survey.

Figure 6. Galaxy number counts in our two evolution models
compared with observational data. Over this range of magnitudes,
the counts are weakly dependent on cosmology and are plotted
here for Ω_0 = 1. The solid line corresponds to our standard
model in which luminosity evolution cancels the k-corrections.
The dashed line corresponds to the less extreme model in which
k-corrections are larger than the luminosity evolution. The data
points are taken from Maddox et al. (1990b) (APM), Heydon-
Dumbleton, Collins & MacGillivray (1989) (EDSGC), Jones et
al. (1991) and Metcalfe et al. (1991).

As a variation, we have also produced a selection of
mock catalogues in which the artificial assumption that
k + e = 0 has been dropped. For these we use the
evolution law k + e = 2.5 log10(1 + z), which corresponds to
weaker luminosity evolution than in our standard model.
To compensate for this we increase the value of φ* by 24%
to 1.73 × 10^-2 h^3 Mpc^-3 to keep the total number of
galaxies in the survey approximately the same as in our standard
model. This model's galaxy counts and redshift distributions
(for the case of Ω_0 = 1) are shown by the dashed lines in
Figs. 6 and 7.
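
The two evolution models can be compared numerically with a short sketch of the radial selection function. Everything below is illustrative: the function names are hypothetical, the luminosity distance is the standard Einstein-de Sitter (Ω_0 = 1) expression rather than a formula quoted in the text, and the incomplete gamma integral is done with a simple trapezoidal rule. The default Schechter parameters are the ones quoted above.

```python
import math

C_OVER_H0 = 2997.92458  # Hubble distance c/H0 in h^-1 Mpc (H0 = 100 h km/s/Mpc)

def lum_distance_eds(z):
    """Luminosity distance in h^-1 Mpc for an Einstein-de Sitter universe."""
    return 2.0 * C_OVER_H0 * (1.0 + z) * (1.0 - 1.0 / math.sqrt(1.0 + z))

def upper_gamma(a, x, x_max=60.0, n=4000):
    """Upper incomplete gamma function, integrated in ln(t) from x to x_max."""
    if x >= x_max:
        return 0.0
    lo, hi = math.log(x), math.log(x_max)
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        t = math.exp(lo + i * h)
        w = 0.5 if i in (0, n) else 1.0
        total += w * t ** a * math.exp(-t)  # t^(a-1) e^-t dt = t^a e^-t dln(t)
    return total * h

def n_selected(z, b_lim, k_plus_e, m_star=-19.5, alpha=-0.97, phi_star=1.4e-2):
    """Mean density (h^3 Mpc^-3) of galaxies brighter than b_lim at redshift z."""
    m_lim = b_lim - k_plus_e(z) - 5.0 * math.log10(lum_distance_eds(z)) - 25.0
    x_min = 10.0 ** (-0.4 * (m_lim - m_star))  # L_min / L*
    return phi_star * upper_gamma(alpha + 1.0, x_min)

def standard(z):
    """Standard model: evolution cancels the k-correction."""
    return 0.0

def weaker(z):
    """Alternative model: k + e = 2.5 log10(1 + z)."""
    return 2.5 * math.log10(1.0 + z)
```

Calling `n_selected(z, 19.45, weaker, phi_star=1.73e-2)` applies the 24% increase in φ* that the text uses to keep the total number of galaxies roughly fixed.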

4.3 Survey construction 

The task of generating a mock galaxy catalogue now consists 
of two steps: choose the location of the observer, and select 
galaxies subject to the geometrical constraints and radial 
selection function specified above. 

Figure 7. The model galaxy redshift distributions. These
distributions are weakly dependent on cosmology and are plotted here
for Ω_0 = 1. The heavy curves, peaking at the higher redshifts,
correspond to the magnitude limit of B_J < 19.45 of the 2dF
survey and light lines to the B_J < 18.9 of the SDSS. As in Fig. 6,
the solid curves are for our standard selection function and the
dashed curves for the alternative model with weaker luminosity
evolution. The median redshifts are z_m = 0.13 and 0.12 for the
2dF catalogues and z_m = 0.11 and 0.10 for the SDSS catalogues.

To aid in the comparison between the different cosmological
models, we choose to place the observer at the same
position in each of the galaxy catalogues. The observer's
position was essentially chosen at random, although we did
apply the weak constraint that the velocity dispersion of
particles within 5 h^-1 Mpc of the observer should be less than
350 km s^-1 in the Ω_0 = 1 model, in order to avoid observers
placed in rich galaxy clusters. This constraint was only
directly applied in model E3S, but by virtue of the fact that
all our simulations have the same phases it is effectively
satisfied in all the structure normalized models. However, for
the COBE normalized simulations that have σ_8 greater than
that required to match the observed abundance of rich
clusters, the galaxy velocity dispersion is typically higher, and
the constraint may be violated. For most analyses of the 2dF
and Sloan surveys the choice of the observer should not be
important, as the volumes of the surveys are large compared
to the local region whose properties are constrained by the
choice of observer.
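
The observer constraint can be sketched as a simple acceptance test. The helper names are hypothetical, and the sketch assumes that "velocity dispersion" means the one-dimensional rms about the local mean velocity, which the text does not specify. Separations use the minimum-image convention of a periodic box.

```python
import math

def local_velocity_dispersion(obs, positions, velocities, box, radius=5.0):
    """1-D rms velocity (km/s) of particles within `radius` (h^-1 Mpc) of `obs`."""
    nearby = []
    for p, v in zip(positions, velocities):
        d2 = 0.0
        for a, b in zip(p, obs):
            d = abs(a - b) % box
            d = min(d, box - d)  # minimum-image separation in a periodic box
            d2 += d * d
        if d2 <= radius * radius:
            nearby.append(v)
    if len(nearby) < 2:
        return 0.0
    mean = [sum(v[i] for v in nearby) / len(nearby) for i in range(3)]
    var = sum((v[i] - mean[i]) ** 2 for v in nearby for i in range(3))
    return math.sqrt(var / (3 * len(nearby)))

def acceptable_observer(obs, positions, velocities, box, limit=350.0):
    """Apply the weak constraint: local dispersion below `limit` km/s."""
    return local_velocity_dispersion(obs, positions, velocities, box) < limit
```

Candidate positions would be drawn at random and the first one passing `acceptable_observer` kept.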

Having chosen the observer's location, we replicate the
periodic cube of the N-body simulation around the observer
to reach a depth of z = 0.5. We choose the same position
for the observer in both the 2dF and SDSS surveys, but
the observer's orientation was not chosen consistently
between the two surveys. We then loop over all the galaxies
within the geometrical boundaries of the survey. From the
model luminosity function and cosmological model we
compute the expected mean number density n_s(r) of galaxies
brighter than the survey magnitude limit at the distance r
of each of these galaxies. We then select the galaxy zero, one
or more times according to a Poisson distribution with mean
n_s(r)/n_g, where n_g is the mean galaxy number density in the
biased galaxy distribution described in Section 3.3. In this
process approximately 1% of the galaxies are selected more
than once and appear with identical positions and velocities
in the mock catalogue. This double sampling essentially
never occurs at z > 0.02, where the selection function drops
to a space density less than n_g. For each selected galaxy we
generate an apparent B_J magnitude consistent with the
selection function, and also a value of z_max, defined as the
redshift corresponding to the maximum distance at which the
younger counterpart of the galaxy would still be brighter
than the survey apparent magnitude limit. In computing
this redshift we include the effect of both the k-correction
and evolution on the galaxy's luminosity. As our idealized
models assume that galaxy mergers do not take place, this
definition of z_max makes it easy to construct volume limited
catalogues in which the mean galaxy density is independent
of redshift. For the genuine surveys, removing the effect of
evolution from the radial dependence of the galaxy density
field will be more problematic, as evolutionary corrections
for each galaxy will be uncertain and over the limited
redshift range probed by these surveys galaxy mergers may also
play a small role. In our catalogues we record the galaxy
redshift, its angular coordinates, the redshift it would have if
it had no peculiar velocity, its apparent B_J magnitude, and
z_max. We also record an index which can be used to identify
the particle to which it corresponded in the original N-body
simulation.
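
The replication and Poisson sampling steps described above can be sketched as follows. The function names and arguments are hypothetical, the angular survey mask is omitted, and the depth is expressed as a comoving radius `r_max` rather than a redshift. `n_of_r` stands for the selection function n_s(r).

```python
import math
import random

def poisson(mean, rng):
    """Draw a Poisson deviate by Knuth's multiplication method (fine for small means)."""
    limit, k, p = math.exp(-mean), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def sample_catalogue(galaxies, observer, box, r_max, n_of_r, n_g, rng):
    """Replicate the periodic box around the observer and select each galaxy image
    zero, one or more times with Poisson mean n_of_r(r) / n_g."""
    n_rep = int(math.ceil(r_max / box))
    shifts = range(-n_rep, n_rep + 1)
    selected = []
    for index, g in enumerate(galaxies):
        for ix in shifts:
            for iy in shifts:
                for iz in shifts:
                    rel = (g[0] + ix * box - observer[0],
                           g[1] + iy * box - observer[1],
                           g[2] + iz * box - observer[2])
                    r = math.sqrt(rel[0] ** 2 + rel[1] ** 2 + rel[2] ** 2)
                    if r > r_max:
                        continue
                    for _ in range(poisson(n_of_r(r) / n_g, rng)):
                        selected.append((index, rel))  # index of the parent particle
    return selected
```

Storing the parent index alongside each selected position mirrors the index recorded in the catalogues, and repeated entries correspond to the multiply-sampled galaxies mentioned in the text.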



4.4 Adding long wavelength power

For a subset of simulations we have applied a technique
which allows the spectrum of density fluctuations present in
the final galaxy catalogues to be extended to wavelengths
longer than those included in the original N-body simulation.
This method, dubbed the Mode Adding Procedure
(MAP), was proposed by Tormen & Bertschinger (1996)
and discussed further by Cole (1997). Essentially, one uses
the Zel'dovich approximation with a change of sign to
remove from the N-body particle distribution the displacements
caused by the longest wavelength modes in the original
simulation. This can be done accurately if these modes
are still in the linear regime. One then generates a new large
scale density field in a much larger box, which samples this
same region of k-space more finely. Displacements are
computed by the Zel'dovich approximation from this new field
and used to perturb both the original simulation cube and
the adjacent replicas. The displacements applied to each of
the replicas differ, as the new large scale density field is not
periodic on the scale of the original simulation cube. We
choose to remove the inner 5^3 modes from the original
simulations and generate the large scale density field in a box
with edge 7 times that of the original simulation (As = 2
and L/S = 7 in the notation of Cole 1997).

As pointed out by Cole (1997), it is important that the
biasing algorithm takes account of the effect of the added
long wavelength power. This is most easily done for
algorithms such as model 1, which are a function of the initial
linear density field. One simply replaces the original linear
density field by a new one constructed by removing the
original long wavelength power and adding the new large scale
density field. It is more complicated to correctly apply a
biasing algorithm that is a function of the final density field,
because the final density field is non-linear, and its short
wavelength modes are coupled to the linear long wavelength
modes. With this in mind, we applied the MAP only in
combination with our bias model 1. In order to keep computer
storage requirements within reasonable bounds, it is
necessary to combine into a single program the application of
the MAP, the biasing prescription, and the survey selection
criteria.



4.5 Inventory

For each of the cosmological simulations listed in Table 1
(21, including the second realization of model E3S), we have
created mock SDSS and 2dF surveys using bias model 1 and
the standard selection function, in which the evolution and
k-corrections cancel. The MAP was not used to add long
wavelength power to these catalogues. For four
structure-normalized cosmological simulations, namely the open Ω_0 = 0.4
model (O4S), the flat Ω_0 = 0.3 model (L3S), and the two
realizations of the Einstein-de Sitter model (E3S), we
constructed a number of variants: changing the bias model to
models 2, 3, 4, 5, and 6; without bias; using the variation
of the selection function described in Section 4.2 in which
luminosity evolution is weaker than the k-corrections; and
using bias model 1 with long wavelength power added using
the MAP.



5 ILLUSTRATIONS

We now compare and contrast the visual properties of the
galaxy distributions in each of the mock catalogues using a
series of redshift space wedge diagrams.

Figure 8. Redshift space slices showing galaxy positions from a variety of the mock 2dF galaxy catalogues. Each wedge shows a strip
90° wide in R.A. and 3° thick in declination, extending to z = 0.3. Each of the six models shown is normalized by the present abundance
of galaxy clusters and biased using model 1 (see Section 3.3). The inset square panels illustrate the effect of bias by showing the real
space particle and galaxy distributions in a 100 × 100 × 20 h^-1 Mpc slab. The top panels show Ω_0 = 1 models: E3S (τCDM) on the left
and E4 (SCDM) on the right. Below these are the open and flat Ω_0 = 0.5 models, O5S and L5S, and, at the bottom, the open and flat
Ω_0 = 0.2 models, O2S and L2S.

Figure 9. Redshift space slices from the cluster normalized mock SDSS catalogues. The correspondence between model and panel is the
same as for Fig. 8. The slices are 130° wide by 6° thick and extend to z = 0.2. The qualitative differences between the structure visible
in these slices and in the corresponding 2dF slices of Fig. 8 are due to the choice of slice thickness and depth rather than any intrinsic
difference in the 2dF and SDSS selection functions.

Figs. 8 and 9 show the galaxy distribution in redshift
space slices extracted from the mock 2dF and SDSS
catalogues constructed from the cluster-normalized N-body
simulations. Each of the catalogues was biased using model 1
of Section 3.3. The 2dF slices (Fig. 8) are 90° wide in
R.A., 3° thick in declination and plotted out to a redshift
of z = 0.3. By contrast, the SDSS slices (Fig. 9), which
are 130° wide (corresponding to the full length of the long
axis of the SDSS ellipse) are 6° thick but plotted only to
z = 0.2. A visual inspection reveals that the structure in all
six models looks remarkably similar. This is essentially a
reflection of the facts that all the simulations were started with
the same phases and that the observer is always located at
the same position. Also, because these models are designed
to produce similar abundances of rich galaxy clusters, the
strength of the "fingers-of-god" effect is also similar. The
1-dimensional galaxy velocity dispersions in all the
cluster-normalized models are in the range 440-465 km s^-1. The
visible effects on the galaxy distribution that result from
varying Ω_0, Λ_0, and the amount of large scale power (Γ)
are quite subtle. Of the two Ω_0 = 1 models, E3S (τCDM)
has more large scale power than E4 (SCDM). A
manifestation of this is that structure in E3S (τCDM) appears
more connected and less choppy than that of E4 (SCDM).
The changes that occur when Ω_0 is varied are related to
the strength of galaxy biasing. For models that are
normalized to produce the observed abundance of rich clusters, the
amplitude of mass fluctuations, σ_8, increases as Ω_0 is
decreased. Thus, the Ω_0 = 1 models require a strong bias, the
Ω_0 = 0.5 models a weak bias, and the Ω_0 = 0.2 models an
anti-bias. The effect of this can be seen most clearly in the
inset square panels of Fig. 8. These show, in real space, a
100 × 100 × 20 h^-1 Mpc slab of the mass and corresponding
galaxy distribution, both sampled to the same density
of n_g ≈ 0.05 h^3 Mpc^-3. In the Ω_0 ≥ 0.5 models, the biasing
algorithm clearly has the effect of mapping underdense
regions in the mass distribution to completely empty voids in
the galaxy distribution. In the anti-biased, Ω_0 = 0.2
models, galaxies continue to trace the mass in the underdense
regions. Finally, comparison of the open and flat models
indicates that the value of the cosmological constant, Λ_0, has
virtually no detectable effect on the galaxy distribution.
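
To give a feel for the size of the "fingers-of-god" quoted above, the sketch below converts a line-of-sight peculiar velocity into an apparent radial displacement. It uses the low-redshift approximation s = r + v/H_0 with H_0 = 100 h km s^-1 Mpc^-1, which is an assumption of this illustration and not a formula from the text. The function names are hypothetical.

```python
H0 = 100.0  # km/s per h^-1 Mpc

def redshift_space_distance(r, v_los):
    """Apparent distance (h^-1 Mpc) of a galaxy at true distance r (h^-1 Mpc)
    with line-of-sight peculiar velocity v_los (km/s), valid at low redshift."""
    return r + v_los / H0

def finger_length(sigma_1d):
    """Approximate rms radial smearing (h^-1 Mpc) from a 1-D velocity dispersion."""
    return sigma_1d / H0
```

On this approximation the 440-465 km s^-1 dispersions of the cluster-normalized models correspond to an rms radial smearing of roughly 4.4-4.65 h^-1 Mpc.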

Figs. 10 and 11 show redshift space slices with the same
geometry as those of Figs. 8 and 9. The top left hand panels
in each figure show the tilted Ω_0 = 1 model, E2, which
by virtue of the tilt is both cluster and COBE normalized.
These distributions should be compared with those in the
upper panels of Figs. 8 and 9, which show corresponding
slices for our other two cluster normalized, Ω_0 = 1 models.
The tilted (E2) model appears intermediate in character
between the τCDM (E3S) and SCDM (E4) models.
This is consistent with the relative amounts of power on
scales of 50-100 h^-1 Mpc in these models. The tilt of n ~ 0.8
with Γ ≈ 0.45 produces more power on these scales than
SCDM with n = 1 and Γ = 0.5, but less than τCDM with
n = 1 and Γ = 0.25. The remaining three panels in Figs. 10
and 11 are for the open (Λ_0 = 0) COBE normalized models.
In this sequence, as Ω_0 is decreased σ_8 decreases, the bias
increases, and Γ decreases. The most visible effect comes
from the variation of σ_8. There is a clear trend such that
the mass distribution looks more evolved, with more crisply
defined filaments and voids, as σ_8 is increased. This trend
is also visible in the galaxy distribution, but here the bias
partially compensates for the changing σ_8, and the
relationship appears weaker. On small scales the effect of the
random velocities within galaxy clusters is just discernible. The
"fingers-of-god" are largest in the Ω_0 = 0.5 model, in which the
galaxies have a mean 1-dimensional velocity dispersion of
485 km s^-1, compared to only 225 km s^-1 in the Ω_0 = 0.3
model.

Figure 10. Redshift space slices from the mock 2dF catalogues for the tilted CDM model E2 (top left) and the open COBE normalized
models, O5, O4 and O3. The corresponding value of Ω_0 and the normalization σ_8 are indicated on each panel. The geometry of the slices
and inset plots of the real space mass and galaxy distributions is the same as in Fig. 8.
Figs. 12 and 13 show 2dF and SDSS redshift space slices
for the set of COBE normalized, flat (Ω_0 + Λ_0 = 1) models.
For this sequence of models, σ_8 decreases weakly as Ω_0
decreases. Thus we see a weaker version of the same trend we
noted in the open COBE normalized models. The higher Ω_0
models have a more evolved density distribution with more
sharply defined voids and filaments. Also there is a similar
trend in the galaxy velocity dispersion and the resulting
"finger-of-god" features. The 1-dimensional velocity
dispersion is 200 km s^-1 for Ω_0 = 0.1 and climbs to 665 km s^-1
for Ω_0 = 0.5. The "fingers-of-god" are extremely pronounced
in the Ω_0 = 1 model, which has a 1-dimensional velocity
dispersion of 890 km s^-1.

Fig. 14 shows 2dF redshift space slices illustrating the effect of varying the
choice of biasing algorithm. Each slice was constructed from the same
cosmological model E3S ($\tau$CDM), but with a variety of biasing algorithms as
indicated on each panel. The correlation functions of each of these galaxy
distributions, shown in Fig. 3, are quite similar. Despite this, some of the
distributions are visually quite distinct. The most striking feature is the
variation in the size and number of voids. The voids are largest and most
numerous in bias model 4 as a result of its sharp density threshold. The models
in which the bias function is a more gradual function of density, such as the
power law case of model 5, have far fewer voids. The panel at the bottom right
shows the effect of using the MAP in conjunction with bias model 1 to add long
wavelength power to the mock catalogue. The distortion of the small scale
galaxy distribution is small as the perturbations are of very long wavelength,
but their effect on measurements of large scale power can be appreciable.

Fig. 15 contrasts the galaxy distribution in redshift space (upper panel) with
what would be observed if true distances rather than redshifts were measurable
(lower panel). The model that has been plotted here is the E3S ($\tau$CDM)
model with galaxies selected using bias model 1. The thickness of the slice is
just 2°.

Figure 11. Redshift space slices from the mock SDSS catalogues for the same models as Fig. 10: the tilted CDM model E2 and the
open COBE normalized models, O5, O4 and O3. The geometry of the slices is the same as in Fig.



6 LIMITATIONS 

We plan to use the mock catalogues presented in this paper 
to help in the important task of testing and calibrating the 
algorithms and statistics that will be applied to the analysis 
of the 2dF and SDSS redshift surveys. We hope that they 
will be similarly useful to other researchers. However, it is 
important to be aware of the limitations of this compilation 
of mock catalogues. 

(i) The mock catalogues are idealized and do not suffer 
from some problems which, at some level, are inevitable in 
the genuine surveys. These include systematic errors in the 
photometry used to select the target galaxies, cosmetic de- 
fects such as regions cut out around bright foreground stars, 
failure to measure redshifts for 100% of the target galax- 
ies, redshift measurement errors, and the residual effects of 
extinction by foreground dust. 

(ii) The model selection functions are simplistic and do not allow for the
effects of galaxy mergers. It will only become possible to adequately constrain
evolution models that incorporate galaxy merging once the joint apparent
magnitude-redshift distributions are accurately measured from the surveys
themselves. Furthermore, we have not attempted to mimic the details of the SDSS
target selection criteria, although we expect that the selection function of
the SDSS will not differ substantially from that implied by the
$B_J$-magnitude limited criterion that we have used.

(iii) Evolution of clustering over the redshift range of the surveys is
ignored: each of our mock catalogues is constructed from a single output time
from the N-body simulations. Clustering evolution is probably very weak over
the depth of the SDSS and 2dF surveys but it may not be negligible for deeper
surveys and will be important for some applications (see, e.g., Nakamura,
Matsubara & Suto 1998).

(iv) The N-body simulations solve the equations describing Newtonian gravity
and therefore explicitly ignore space curvature across the simulation box. One
consequence of this is that in the open models we are forced to use
$4\pi r_c^2\,dr_c$, where $r_c$ is the comoving distance to redshift $z$, as
the volume element rather than the correct relativistic expression. However,
for the depth of the present surveys this is a very small effect.

(v) The simulations have limited mass and force resolution. The spatial
resolution in the initial conditions is limited to scales greater than the mean
particle separation of 1.8 $h^{-1}$ Mpc. However, the power on these scales in
the final configuration is dominated by non-linear transfer from large scales.
Thus, the range of reliability of the estimated correlation functions and power
spectra is determined by the force resolution, $\epsilon = 90\,h^{-1}$ kpc
(comoving), and the particle mass,
$m_p = 1.64 \times 10^{12}\,\Omega_0\,h^{-1}\,M_\odot$. The smallest structures
that are resolved are galaxy groups and clusters.

(vi) Because of the finite size of the N-body simulation volume, k-space is
coarsely sampled and, in the absence of the MAP extension, the catalogues have
no power in wavelengths $\lambda > 345.6\,h^{-1}$ Mpc. Since the depth of the
surveys is comparable to the size of the N-body simulations, the coarse
sampling could be problematic if one were to estimate the power spectrum from
the mock catalogues using a high resolution estimator at values of $k$ which do
not match modes in the original simulation. There should be no problems for
clustering statistics, such as the correlation function, which contain
contributions from a broad range of $k$.

Figure 12. Redshift space slices from the mock 2dF catalogues for the flat COBE normalized models, E1 (CCDM), L1, L2, L3, L4
and L5. The corresponding values of $\Omega_0$, $\Lambda_0$ and the normalization $\sigma_8$ are indicated on each panel. The geometry of the slices and inset
plots of the real space mass and galaxy distributions are the same as in Fig. 0.

(vii) The application of the MAP extends the power coverage in the mock
catalogues to wavelengths as large as $\lambda = 2420\,h^{-1}$ Mpc and improves
the sampling of k-space at low $k$ ($k < 0.026\,h$ Mpc$^{-1}$), but the
sampling of k-space remains coarse at larger $k$. Also, the MAP slightly
modulates the frequencies of the existing high-$k$ modes, with the result that
although the high-$k$ power is still peaked around the modes present in the
original simulation, some power is distributed to neighbouring values of $k$.
Thus, narrow band estimates of the power spectrum at high $k$ may still be
slightly affected.

Figure 13. Redshift space slices from the mock SDSS catalogues for the same models as Fig. 12: the flat COBE normalized models, E1
(CCDM), L1, L2, L3, L4 and L5. The geometry of the slices is the same as in Fig.

(viii) The mock catalogues assume galaxies trace the velocity field of the
dark matter and thus that there is no velocity bias in the sense discussed, for
example, by Carlberg, Couchman & Thomas (1990).



(ix) The adopted models of spatial bias are at best simplifications of the
complex physics of galaxy formation. Since reliable a priori predictions of
bias are not possible with current simulation techniques, we have given each of
our adopted cosmological models a "good chance" by choosing bias parameters
that force-fit the amplitude and (to the extent possible) the shape of the
observed galaxy correlation function. Our logic is that if the cosmological
model in question is to be consistent with current galaxy clustering data, then
the "true" description of galaxy formation must somehow achieve the same thing
that our biasing prescription does. In selected cases we have produced multiple
mock catalogues with a variety of biasing algorithms, so that the sensitivity
of methods to the details of biasing can be investigated.
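
To give a feel for the size of the curvature effect mentioned in limitation (iv), the following Python sketch compares the Euclidean volume element $4\pi r_c^2\,dr_c$ with the standard relativistic one for an open, $\Lambda_0 = 0$ universe, in which $r_c^2$ is replaced by the square of $(c/H_0)\,\Omega_k^{-1/2}\sinh(\sqrt{\Omega_k}\,H_0 r_c/c)$ with $\Omega_k = 1 - \Omega_0$. The function names, the Simpson integration and the example redshift are illustrative assumptions and are not part of the mock catalogue software.

```python
import math

C_KMS = 299792.458  # speed of light in km/s
H0 = 100.0          # Hubble constant in h km/s/Mpc, so distances are in h^-1 Mpc


def comoving_distance(z, omega0, steps=2000):
    """Line-of-sight comoving distance r_c (h^-1 Mpc) for a Lambda = 0 model."""
    omega_k = 1.0 - omega0

    def inv_e(zp):
        # 1/E(z) with matter and curvature terms only (no cosmological constant)
        return 1.0 / math.sqrt(omega0 * (1 + zp) ** 3 + omega_k * (1 + zp) ** 2)

    # Composite Simpson rule on [0, z]; steps must be even
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return (C_KMS / H0) * total * h / 3.0


def volume_element_ratio(z, omega0):
    """Relativistic / Euclidean volume element at redshift z (open, Lambda = 0)."""
    omega_k = 1.0 - omega0
    if omega_k <= 0.0:
        return 1.0  # flat case: the two expressions coincide
    x = math.sqrt(omega_k) * comoving_distance(z, omega0) * H0 / C_KMS
    return (math.sinh(x) / x) ** 2
```

For example, `volume_element_ratio(0.2, 0.3)` should differ from unity by less than one per cent, in line with the statement that the effect is very small at the depth of the present surveys.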



7 INSTRUCTION MANUAL 

Figure 14. Redshift space slices from the mock 2dF catalogues showing the effect of varying the choice of biasing algorithm. Each slice
was constructed from the same cosmological model E3S ($\tau$CDM), but with a variety of biasing algorithms as indicated on each panel.
The panel at the bottom right shows the effect of using the MAP in conjunction with bias model 1 to add long wavelength power to the
mock catalogue. The geometry of the slices and inset plots of the real space mass and galaxy distributions are the same as in Fig.

Figure 15. A comparison of the galaxy distribution in redshift space (upper panel) and real space (lower panel) for a 2° thick slice from
a SDSS mock catalogue constructed from model E3S ($\tau$CDM).

Each mock catalogue can be downloaded from our WWW site
http://star-www.dur.ac.uk/~cole/mocks/main.html .
Included in these pages is a detailed description of the catalogue file format.
Each of the SDSS catalogue files occupies 24 Mbytes. The smaller 2dF SGP and
NGP catalogues occupy 5.4 and 2.7 Mbytes respectively. For each catalogue file
there is an associated selection function file that tabulates the expected
number of galaxies and the number density of galaxies as a function of redshift
for each model. We have also made available a number of Fortran subroutines.
The first can be used to read the mock catalogue files. A second reads one of
the tabulated selection functions and can be used to generate random galaxy
positions consistent with the survey radial selection function and geometric
boundaries.
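
The idea behind generating random positions from a tabulated selection function can be sketched in Python as follows. This is not the distributed Fortran subroutine: the layout of the selection function file is not reproduced here, so the sketch assumes the table has already been read into a list of redshift bin edges and a list of expected galaxy counts per bin. It also assumes a simple rectangular patch in the two angles in place of the true survey boundaries, and a uniform distribution in redshift within each bin. All function names are hypothetical.

```python
import bisect
import math
import random


def build_cdf(counts):
    """Cumulative distribution from expected galaxy counts per redshift bin."""
    total = float(sum(counts))
    cdf, running = [], 0.0
    for c in counts:
        running += c
        cdf.append(running / total)
    return cdf


def draw_redshift(z_edges, cdf, rng):
    """Inverse-CDF draw: pick a bin by its expected count, then a z inside it."""
    u = rng.random()
    i = min(bisect.bisect_left(cdf, u), len(cdf) - 1)
    return rng.uniform(z_edges[i], z_edges[i + 1])


def draw_direction(phi_range, theta_range, rng):
    """Uniform point on the sky inside a rectangular angular patch (radians)."""
    phi = rng.uniform(*phi_range)
    # Equal-area sampling requires sin(theta) to be uniform, not theta itself
    s = rng.uniform(math.sin(theta_range[0]), math.sin(theta_range[1]))
    return phi, math.asin(s)


def random_catalogue(n, z_edges, counts, phi_range, theta_range, seed=0):
    """Return n random points as (x, y, z) Cartesian redshift coordinates."""
    rng = random.Random(seed)
    cdf = build_cdf(counts)
    points = []
    for _ in range(n):
        zgal = draw_redshift(z_edges, cdf, rng)
        phi, theta = draw_direction(phi_range, theta_range, rng)
        points.append((zgal * math.cos(theta) * math.cos(phi),
                       zgal * math.cos(theta) * math.sin(phi),
                       zgal * math.sin(theta)))
    return points
```

The output uses the same Cartesian redshift convention as the catalogue files, so random and mock points can be passed through the same analysis code.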

The main catalogue files list 7 properties for each catalogued galaxy: $x$,
$y$, $z$, $z_{\rm rest}$, $B_J$, $z_{\rm max}$ and $i_{\rm ident}$. The first
three of these are Cartesian redshift coordinates, i.e. the galaxy redshift is
$z_{\rm gal} = (x^2 + y^2 + z^2)^{1/2}$ and two angular coordinates are defined
by the relations $\sin\theta = z/z_{\rm gal}$ and $\tan\phi = y/x$. For the 2dF
catalogues these angles are simply the declination $\delta = \theta$ and Right
Ascension R.A. $= \phi$. In the case of the SDSS they instead give a latitude,
$\theta$, and longitude, $\phi$, relative to a pole at the centre of the SDSS
survey region and with respect to the major axis of the SDSS ellipse. The
quantity $z_{\rm rest}$ is the redshift the galaxy would have if it had no
peculiar motion and was just moving with the uniform Hubble flow. The redshift
space coordinates can be converted to real space coordinates by simply scaling
each component by the ratio $z_{\rm rest}/z_{\rm gal}$. The galaxy's apparent
magnitude is given by $B_J$. The maximum redshift at which the galaxy would
enter into the catalogue, taking account of the k-correction and luminosity
evolution, is $z_{\rm max}$. Thus selecting galaxies with both
$z_{\rm rest} < z_{\rm cut}$ and $z_{\rm max} > z_{\rm cut}$ will produce a
volume limited catalogue to redshift $z_{\rm cut}$. Note that such volume
limited catalogues will have a mean comoving number density of galaxies which
is independent of redshift. This occurs in our idealized models because we have
assumed that galaxy merging can be ignored over the limited redshift range
probed by the surveys and because we have included both the k-correction and
evolutionary correction in our definition of $z_{\rm max}$. The last property,
$i_{\rm ident}$, is simply an index which relates the galaxy to a particle in
the original N-body simulation.
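
The relations between the seven catalogue properties can be sketched in Python as below. The record type and function names are hypothetical and stand in for whatever a reader of the catalogue files returns; only the formulae themselves are taken from the description of the file contents.

```python
import math
from typing import NamedTuple


class MockGalaxy(NamedTuple):
    x: float       # Cartesian redshift-space coordinates (units of redshift)
    y: float
    z: float
    z_rest: float  # redshift the galaxy would have with no peculiar motion
    mag: float     # apparent magnitude
    z_max: float   # maximum redshift at which the galaxy stays in the sample
    ident: int     # index of the parent N-body particle


def redshift(g):
    """Observed galaxy redshift z_gal = (x^2 + y^2 + z^2)^(1/2)."""
    return math.sqrt(g.x ** 2 + g.y ** 2 + g.z ** 2)


def angles(g):
    """Return (theta, phi) in radians with sin(theta) = z/z_gal, tan(phi) = y/x."""
    theta = math.asin(g.z / redshift(g))
    phi = math.atan2(g.y, g.x) % (2.0 * math.pi)  # atan2 resolves the quadrant
    return theta, phi


def real_space_position(g):
    """Scale each component by z_rest / z_gal to remove the peculiar velocity."""
    scale = g.z_rest / redshift(g)
    return g.x * scale, g.y * scale, g.z * scale


def volume_limited(galaxies, z_cut):
    """Keep galaxies with z_rest < z_cut that would still be selected at z_cut."""
    return [g for g in galaxies if g.z_rest < z_cut and g.z_max > z_cut]
```

For the 2dF catalogues the two angles returned by `angles` are the declination and Right Ascension; for the SDSS catalogues they are the survey-centred latitude and longitude.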



8 DISCUSSION 

We have constructed, and made publicly available, a set of mock galaxy
catalogues, built from N-body simulations, that have the geometry and selection
function appropriate to the forthcoming SDSS and 2dF redshift surveys. Our main
intention has been to generate an extensive and flexible suite of artificial
datasets which may be used to develop, test, and fine-tune statistical tools
intended for the analysis of the real surveys and, eventually, for testing the
real data against theoretical predictions. To this purpose we have generated
mock surveys from simulations with a range of cosmological parameters and made
a variety of (biasing) assumptions for extracting galaxies from the N-body
simulations.

Our mock catalogues are restricted to CDM cosmolo- 
gies with Gaussian initial fluctuations, but with a range of 
values for the cosmological parameters $\Omega_0$, $\Lambda_0$, $H_0$, spectral
shape parameter, $\Gamma$, etc. It will be interesting in future to
extend this kind of work to other cosmological models, par- 
ticularly models that do not assume Gaussian initial fluctu- 
ations. At present it remains somewhat unclear which non- 
Gaussian models will be the most profitable to investigate. 
In our CDM simulations, the fluctuation amplitude is set in 
two alternative ways: by matching the amplitude of cosmic 
microwave background fluctuations as measured by COBE 
(and extrapolated to smaller scales according to standard as- 
sumptions) or by matching the observed abundance of rich 
galaxy clusters. One of our models (tilted $\Omega_0 = 1$) is deliberately
constructed so as to match both of these constraints
while two others (open $\Omega_0 = 0.4$ and flat $\Omega_0 = 0.3$) come
close to doing so in their own right. Although our suite of
20 models is far from providing a well-sampled grid in this 
multidimensional parameter space, it does include many of 
the cosmological models currently regarded as acceptable. 



We have implemented a variety of biasing prescriptions, 
all of which are designed to reproduce approximately the 
known APM galaxy correlation function over a limited range 
of scales. The motivation for providing alternative biasing 
schemes is to enable tests of the sensitivity to these assump- 
tions of statistics which attempt to infer properties of the 
mass from the measured properties of the galaxies. In the 
absence of reliable theoretical predictions for the formation 
sites of galaxies, we have taken the pragmatic approach of 
using simple formulae, with one or two adjustable parame- 
ters, to characterise the probability that a galaxy has formed 
in a region where the density field has a given value. We have 
considered both Lagrangian and Eulerian schemes in which 
the galaxies are identified in the initial and final density 
fields respectively. We have restricted attention to "local bi- 
asing" models in which the probability depends solely on the 
value of the field smoothed in the local neighbourhood of a 
point. An interesting extension would be to implement non- 
local biasing prescriptions such as the cooperative galaxy 
formation model of Bower et al. (1993). 
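
The notion of a local biasing scheme can be illustrated with a short Python sketch in which each simulation particle is kept as a galaxy with a probability that depends only on the smoothed overdensity $\delta$ at its position. The two functional forms below, a sharp threshold and a capped power law, are chosen purely for illustration; they are not the parameterizations of our numbered bias models, and all names and parameters are hypothetical.

```python
import random


def threshold_probability(delta, delta_min):
    """Sharp threshold: galaxies form only where delta exceeds delta_min."""
    return 1.0 if delta > delta_min else 0.0


def power_law_probability(delta, alpha, delta_ref):
    """Gradual bias: probability rises as (1 + delta)^alpha, capped at 1.

    Assumes alpha > 0 and delta >= -1.
    """
    return min(1.0, ((1.0 + delta) / (1.0 + delta_ref)) ** alpha)


def select_galaxies(deltas, probability, sampling=1.0, seed=0):
    """Keep particle i with probability sampling * P(delta_i); return indices."""
    rng = random.Random(seed)
    return [i for i, d in enumerate(deltas)
            if rng.random() < sampling * probability(d)]
```

A call such as `select_galaxies(deltas, lambda d: threshold_probability(d, 1.0))` empties all regions below the threshold, whereas the power-law form thins them gradually. This is the qualitative difference behind the contrast in void sizes between sharp-threshold and power-law bias models.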

Over the range of scales adequately modelled by our N-body
simulations ($\sim 1$-$10\,h^{-1}$ Mpc), our 2-parameter biased
galaxy distributions match the APM data remarkably well 
in almost all the cosmological models we have considered, 
including those in which an antibias is required on small 
scales. In some cases, a 1-parameter model suffices to obtain 
acceptable results. In all cases the bias in the galaxy dis- 
tribution is scale-dependent even over the relatively narrow 
range of scales covered in our simulations. As discussed by 
Jenkins et al. (1998), scale-dependent biasing is a require- 
ment of all viable CDM models, and it is encouraging that 
simple heuristic models that depend only on local density 
can achieve this, albeit over a limited range of scales. When 
using our mock catalogues it is important to bear in mind 
that while the locations of the galaxies are biased, the veloc- 
ities are not - our galaxies are assumed to share the velocity 
distribution of the associated dark matter. 

A number of extensions of our work are possible. One 
that we have already implemented but not discussed in this 
paper is the construction of mock catalogues with the prop- 
erties of other surveys, particularly surveys of IRAS galax- 
ies like the 1.2 Jy (Strauss et al. 1990) and the PSCZ sur- 
veys (Saunders et al. 1995). Mock catalogues of the latter 
are already available at the same web address as our 2dF 
and SDSS mock catalogues. There are several ways in which 
our catalogues could be improved to overcome at least some 
of the limitations discussed in Section 6. For example, better
N-body simulations are certainly possible with current
technology. Larger simulations would be particularly advan- 
tageous, since the size of those we have used here is compa- 
rable to the depth of the real surveys. The 1-billion particle 
"Hubble Volume" simulation of a 2 Gigaparsec CDM volume 
currently being carried out by the Virgo consortium (Evrard 
et al. in preparation) will certainly be large enough, and we
plan to extract mock catalogues from it shortly. An interest- 
ing aspect of this simulation is that data are output along a 
light cone and so the evolution of clustering with lookback 
time can be incorporated into the mock catalogues. Clustering
evolution is expected to be negligible in the main 2dF
and SDSS surveys, but it will be important in the proposed
faint extensions of these surveys and in QSO surveys.

A further improvement would be to construct ensembles of mock catalogues from
independent simulations of each cosmological volume. These would help quantify
the cosmic variance expected in the real surveys. As we discussed in Section 3,
sampling effects are still appreciable on large scales even with the huge
volumes that will be surveyed with the 2dF and SDSS data. In fact, the
fundamental mode in our simulations had a noticeable stochastic downward
fluctuation which can confuse the comparison with data on large scales.
Although this sort of effect can be quantified analytically to some extent,
simulations are useful in order to check for the effects of biasing. Finally,
within a given N-body simulation, there are already better ways of identifying
galaxies than the simple heuristic biasing formulae that we have used. These
new methods consist of grafting into an N-body simulation the galaxy formation
rules of semi-analytic galaxy formation models (e.g. Kauffmann, White &
Guiderdoni 1993; Cole et al. 1994). Examples of this approach already exist
(Kauffmann et al. 1997; Governato et al. 1998), but extensive mock catalogues
are still to be constructed using this technique. The combined
N-body/semi-analytic approach offers the advantage of producing realistic
catalogues that include internal galaxy properties such as colours,
star-formation rates, morphological types, etc. Such information would be
particularly valuable for exploiting the photometric data of the SDSS survey.

We are planning to implement several of the improve- 
ments just mentioned and to update our web page as we 
progress. In the meantime we hope that the gallery of mock 
catalogues already available will be of use to researchers in- 
terested in the 2dF and SDSS surveys. 
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